
SCHAUM'S OUTLINE OF

THEORY AND PROBLEMS

OF

LINEAR ALGEBRA



BY

SEYMOUR LIPSCHUTZ, Ph.D.

Associate Professor of Mathematics
Temple University



SCHAUM'S OUTLINE SERIES

McGRAW-HILL BOOK COMPANY

New York, St. Louis, San Francisco, Toronto, Sydney




Copyright © 1968 by McGraw-Hill, Inc. All Rights Reserved. Printed in the 
United States of America. No part of this publication may be reproduced, 
stored in a retrieval system, or transmitted, in any form or by any means, 
electronic, mechanical, photocopying, recording, or otherwise, without the 
prior written permission of the publisher. 







Preface 

Linear algebra has in recent years become an essential part of the mathematical 
background required of mathematicians, engineers, physicists and other scientists. 
This requirement reflects the importance and wide applications of the subject matter. 

This book is designed for use as a textbook for a formal course in linear algebra 
or as a supplement to all current standard texts. It aims to present an introduction to 
linear algebra which will be found helpful to all readers regardless of their fields of 
specialization. More material has been included than can be covered in most first 
courses. This has been done to make the book more flexible, to provide a useful book 
of reference, and to stimulate further interest in the subject. 

Each chapter begins with clear statements of pertinent definitions, principles and 
theorems together with illustrative and other descriptive material. This is followed 
by graded sets of solved and supplementary problems. The solved problems serve to 
illustrate and amplify the theory, bring into sharp focus those fine points without 
which the student continually feels himself on unsafe ground, and provide the repetition 
of basic principles so vital to effective learning. Numerous proofs of theorems are 
included among the solved problems. The supplementary problems serve as a complete 
review of the material of each chapter. 

The first three chapters treat of vectors in Euclidean space, linear equations and 
matrices. These provide the motivation and basic computational tools for the abstract 
treatment of vector spaces and linear mappings which follow. A chapter on eigen- 
values and eigenvectors, preceded by determinants, gives conditions for representing 
a linear operator by a diagonal matrix. This naturally leads to the study of various 
canonical forms, specifically the triangular, Jordan and rational canonical forms. 
In the last chapter, on inner product spaces, the spectral theorem for symmetric op- 
erators is obtained and is applied to the diagonalization of real quadratic forms. For 
completeness, the appendices include sections on sets and relations, algebraic structures 
and polynomials over a field. 

I wish to thank many friends and colleagues, especially Dr. Martin Silverstein and 
Dr. Hwa Tsang, for invaluable suggestions and critical review of the manuscript. 
I also want to express my gratitude to Daniel Schaum and Nicola Monti for their very 
helpful cooperation. 

Seymour Lipschutz 
Temple University 
January, 1968 



CONTENTS 



Chapter 1  VECTORS IN R^n AND C^n    1

Introduction. Vectors in R^n. Vector addition and scalar multiplication. Dot
product. Norm and distance in R^n. Complex numbers. Vectors in C^n.

Chapter 2  LINEAR EQUATIONS    18

Introduction. Linear equation. System of linear equations. Solution of a sys-
tem of linear equations. Solution of a homogeneous system of linear equations.

Chapter 3  MATRICES    35

Introduction. Matrices. Matrix addition and scalar multiplication. Matrix
multiplication. Transpose. Matrices and systems of linear equations. Echelon
matrices. Row equivalence and elementary row operations. Square matrices.
Algebra of square matrices. Invertible matrices. Block matrices.

Chapter 4  VECTOR SPACES AND SUBSPACES    63

Introduction. Examples of vector spaces. Subspaces. Linear combinations,
linear spans. Row space of a matrix. Sums and direct sums.

Chapter 5  BASIS AND DIMENSION    86

Introduction. Linear dependence. Basis and dimension. Dimension and sub-
spaces. Rank of a matrix. Applications to linear equations. Coordinates.

Chapter 6  LINEAR MAPPINGS    121

Mappings. Linear mappings. Kernel and image of a linear mapping. Singular
and nonsingular mappings. Linear mappings and systems of linear equations.
Operations with linear mappings. Algebra of linear operators. Invertible
operators.

Chapter 7  MATRICES AND LINEAR OPERATORS    150

Introduction. Matrix representation of a linear operator. Change of basis.
Similarity. Matrices and linear mappings.

Chapter 8  DETERMINANTS    171

Introduction. Permutations. Determinant. Properties of determinants. Mi-
nors and cofactors. Classical adjoint. Applications to linear equations. Deter-
minant of a linear operator. Multilinearity and determinants.

Chapter 9  EIGENVALUES AND EIGENVECTORS    197

Introduction. Polynomials of matrices and linear operators. Eigenvalues and
eigenvectors. Diagonalization and eigenvectors. Characteristic polynomial,
Cayley-Hamilton theorem. Minimum polynomial. Characteristic and minimum
polynomials of linear operators.

Chapter 10  CANONICAL FORMS    222

Introduction. Triangular form. Invariance. Invariant direct-sum decom-
positions. Primary decomposition. Nilpotent operators, Jordan canonical
form. Cyclic subspaces. Rational canonical form. Quotient spaces.

Chapter 11  LINEAR FUNCTIONALS AND THE DUAL SPACE    249

Introduction. Linear functionals and the dual space. Dual basis. Second dual
space. Annihilators. Transpose of a linear mapping.

Chapter 12  BILINEAR, QUADRATIC AND HERMITIAN FORMS    261

Bilinear forms. Bilinear forms and matrices. Alternating bilinear forms.
Symmetric bilinear forms, quadratic forms. Real symmetric bilinear forms.
Law of inertia. Hermitian forms.

Chapter 13  INNER PRODUCT SPACES    279

Introduction. Inner product spaces. Cauchy-Schwarz inequality. Orthogo-
nality. Orthonormal sets. Gram-Schmidt orthogonalization process. Linear
functionals and adjoint operators. Analogy between A(V) and C, special
operators. Orthogonal and unitary operators. Orthogonal and unitary mat-
rices. Change of orthonormal basis. Positive operators. Diagonalization and
canonical forms in Euclidean spaces. Diagonalization and canonical forms in
unitary spaces. Spectral theorem.

Appendix A  SETS AND RELATIONS    315

Sets, elements. Set operations. Product sets. Relations. Equivalence
relations.

Appendix B  ALGEBRAIC STRUCTURES    320

Introduction. Groups. Rings, integral domains and fields. Modules.

Appendix C  POLYNOMIALS OVER A FIELD    327

Introduction. Ring of polynomials. Notation. Divisibility. Factorization.

INDEX    331



chapter 1 



Vectors in R^n and C^n



INTRODUCTION 

In various physical applications there appear certain quantities, such as temperature 
and speed, which possess only "magnitude". These can be represented by real numbers and 
are called scalars. On the other hand, there are also quantities, such as force and velocity, 
which possess both "magnitude" and "direction". These quantities can be represented by 
arrows (having appropriate lengths and directions and emanating from some given ref- 
erence point O) and are called vectors. In this chapter we study the properties of such 
vectors in some detail. 



We begin by considering the following operations on vectors. 



(i) Addition: The resultant u + v of two vectors u and v is obtained by the so-called
parallelogram law, i.e. u + v is the diagonal of the parallelogram formed by u and v
as shown on the right.

(ii) Scalar multiplication: The product ku of a real number k by a vector u is obtained
by multiplying the magnitude of u by k and retaining the same direction if k ≥ 0
or the opposite direction if k < 0, as shown on the right.

Now we assume the reader is familiar with the representation of the points in the plane 
by ordered pairs of real numbers. If the origin of the axes is chosen at the reference point 
above, then every vector is uniquely determined by the coordinates of its endpoint. The 
relationship between the above operations and endpoints follows. 

(i) Addition: If (a, b) and (c, d) are the endpoints of the vectors u and v, then (a + c, b + d)
will be the endpoint of u + v, as shown in Fig. (a) below.



[Fig. (a): the parallelogram on u and v, with u + v ending at (a + c, b + d).
 Fig. (b): the vector ku ending at (ka, kb).]



(ii) Scalar multiplication: If (a, b) is the endpoint of the vector u, then (ka, kb) will be the
endpoint of the vector ku, as shown in Fig. (b) above.




Mathematically, we identify a vector with its endpoint; that is, we call the ordered pair
(a, b) of real numbers a vector. In fact, we shall generalize this notion and call an n-tuple
(a1, a2, ..., an) of real numbers a vector. We shall again generalize and permit the co-
ordinates of the n-tuple to be complex numbers and not just real numbers. Furthermore, in
Chapter 4, we shall abstract properties of these n-tuples and formally define the mathe-
matical system called a vector space.

We assume the reader is familiar with the elementary properties of the real number
field which we denote by R.



VECTORS IN R^n

The set of all n-tuples of real numbers, denoted by R^n, is called n-space. A particular
n-tuple in R^n, say

    u = (u1, u2, ..., un)

is called a point or vector; the real numbers ui are called the components (or: coordinates)
of the vector u. Moreover, when discussing the space R^n we use the term scalar for the
elements of R, i.e. for the real numbers.

Example 1.1: Consider the following vectors:

    (0, 1),   (1, -3),   (1, 2, √3, 4),   (-5, -1, 0, π)

The first two vectors have two components and so are points in R^2; the last two
vectors have four components and so are points in R^4.

Two vectors u and v are equal, written u = v, if they have the same number of com-
ponents, i.e. belong to the same space, and if corresponding components are equal. The
vectors (1, 2, 3) and (2, 3, 1) are not equal, since corresponding components are not equal.

Example 1.2: Suppose (x - y, x + y, z - 1) = (4, 2, 3). Then, by definition of equality of vectors,

    x - y = 4
    x + y = 2
    z - 1 = 3

Solving the above system of equations gives x = 3, y = -1, and z = 4.

VECTOR ADDITION AND SCALAR MULTIPLICATION

Let u and v be vectors in R^n:

    u = (u1, u2, ..., un)   and   v = (v1, v2, ..., vn)

The sum of u and v, written u + v, is the vector obtained by adding corresponding components:

    u + v = (u1 + v1, u2 + v2, ..., un + vn)

The product of a real number k by the vector u, written ku, is the vector obtained by multi-
plying each component of u by k:

    ku = (ku1, ku2, ..., kun)

Observe that u + v and ku are also vectors in R^n. We also define

    -u = -1u   and   u - v = u + (-v)

The sum of vectors with different numbers of components is not defined.
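Component-wise addition and scalar multiplication translate directly into code. The following is a minimal Python sketch (the function names are ours, not the text's), checked against Example 1.3 below:

```python
def vec_add(u, v):
    # Sum of vectors: add corresponding components; undefined if lengths differ.
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return tuple(ui + vi for ui, vi in zip(u, v))

def scalar_mult(k, u):
    # Product of a scalar k and a vector u: multiply every component by k.
    return tuple(k * ui for ui in u)

u, v = (1, -3, 2, 4), (3, 5, -1, -2)
print(vec_add(u, v))                                   # (4, 2, 1, 2)
print(scalar_mult(5, u))                               # (5, -15, 10, 20)
print(vec_add(scalar_mult(2, u), scalar_mult(-3, v)))  # (-7, -21, 7, 14)
```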




Example 1.3: Let u = (1, -3, 2, 4) and v = (3, 5, -1, -2). Then

    u + v = (1 + 3, -3 + 5, 2 - 1, 4 - 2) = (4, 2, 1, 2)

    5u = (5·1, 5·(-3), 5·2, 5·4) = (5, -15, 10, 20)

    2u - 3v = (2, -6, 4, 8) + (-9, -15, 3, 6) = (-7, -21, 7, 14)

Example 1.4: The vector (0, 0, ..., 0) in R^n, denoted by 0, is called the zero vector. It is similar
to the scalar 0 in that, for any vector u = (u1, u2, ..., un),

    u + 0 = (u1 + 0, u2 + 0, ..., un + 0) = (u1, u2, ..., un) = u

Basic properties of the vectors in R^n under the operations of vector addition and scalar
multiplication are described in the following theorem.

Theorem 1.1: For any vectors u, v, w ∈ R^n and any scalars k, k' ∈ R:

    (i)   (u + v) + w = u + (v + w)        (v)    k(u + v) = ku + kv
    (ii)  u + 0 = u                        (vi)   (k + k')u = ku + k'u
    (iii) u + (-u) = 0                     (vii)  (kk')u = k(k'u)
    (iv)  u + v = v + u                    (viii) 1u = u

Remark: Suppose u and v are vectors in R^n for which u = kv for some nonzero scalar
k ∈ R. Then u is said to be in the same direction as v if k > 0, and in the op-
posite direction if k < 0.

DOT PRODUCT

Let u and v be vectors in R^n:

    u = (u1, u2, ..., un)   and   v = (v1, v2, ..., vn)

The dot or inner product of u and v, denoted by u · v, is the scalar obtained by multiplying
corresponding components and adding the resulting products:

    u · v = u1v1 + u2v2 + ··· + unvn

The vectors u and v are said to be orthogonal (or: perpendicular) if their dot product is
zero: u · v = 0.

Example 1.5: Let u = (1, -2, 3, -4), v = (6, 7, 1, -2) and w = (5, -4, 5, 7). Then

    u · v = 1·6 + (-2)·7 + 3·1 + (-4)·(-2) = 6 - 14 + 3 + 8 = 3

    u · w = 1·5 + (-2)·(-4) + 3·5 + (-4)·7 = 5 + 8 + 15 - 28 = 0

Thus u and w are orthogonal.
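In code, the dot product is a single sum over paired components, and orthogonality is then just a zero test. A minimal Python sketch (function names are ours):

```python
def dot(u, v):
    # Dot product: multiply corresponding components and add the products.
    if len(u) != len(v):
        raise ValueError("dot product undefined for different lengths")
    return sum(ui * vi for ui, vi in zip(u, v))

def orthogonal(u, v):
    # u and v are orthogonal iff their dot product is zero.
    return dot(u, v) == 0

u, v, w = (1, -2, 3, -4), (6, 7, 1, -2), (5, -4, 5, 7)
print(dot(u, v))         # 3, as in Example 1.5
print(orthogonal(u, w))  # True
```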
Basic properties of the dot product in R^n follow.

Theorem 1.2: For any vectors u, v, w ∈ R^n and any scalar k ∈ R:

    (i)  (u + v) · w = u · w + v · w       (iii) u · v = v · u
    (ii) (ku) · v = k(u · v)               (iv)  u · u ≥ 0, and u · u = 0 iff u = 0

Remark: The space R^n with the above operations of vector addition, scalar multiplication
and dot product is usually called Euclidean n-space.

NORM AND DISTANCE IN R^n

Let u and v be vectors in R^n: u = (u1, u2, ..., un) and v = (v1, v2, ..., vn). The dis-
tance between the points u and v, written d(u, v), is defined by

    d(u, v) = √((u1 - v1)^2 + (u2 - v2)^2 + ··· + (un - vn)^2)





The norm (or: length) of the vector u, written ||u||, is defined to be the nonnegative square
root of u · u:

    ||u|| = √(u · u) = √(u1^2 + u2^2 + ··· + un^2)

By Theorem 1.2, u · u ≥ 0 and so the square root exists. Observe that

    d(u, v) = ||u - v||

Example 1.6: Let u = (1, -2, 4, 1) and v = (3, 1, -5, 0). Then

    d(u, v) = √((1 - 3)^2 + (-2 - 1)^2 + (4 + 5)^2 + (1 - 0)^2) = √95

    ||v|| = √(3^2 + 1^2 + (-5)^2 + 0^2) = √35

Now if we consider two points, say p = (a, b) and q = (c, d) in the plane R^2, then

    ||p|| = √(a^2 + b^2)   and   d(p, q) = √((a - c)^2 + (b - d)^2)

That is, ||p|| corresponds to the usual Euclidean length of the arrow from the origin to the
point p, and d(p, q) corresponds to the usual Euclidean distance between the points p and
q, as shown below:



[Figure: the arrow from the origin to p = (a, b), and the points p = (a, b) and
 q = (c, d) joined by horizontal and vertical legs of lengths |a - c| and |b - d|.]



A similar result holds for points on the line R and in space R^3.

Remark: A vector e is called a unit vector if its norm is 1: ||e|| = 1. Observe that, for
any nonzero vector u ∈ R^n, the vector e_u = u/||u|| is a unit vector in the same
direction as u.
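Norm, distance and normalization follow directly from these definitions; a small Python sketch (names are ours), checked against Example 1.6:

```python
import math

def norm(u):
    # ||u|| = sqrt(u1^2 + u2^2 + ... + un^2), the nonnegative square root of u·u.
    return math.sqrt(sum(ui * ui for ui in u))

def distance(u, v):
    # d(u, v) = ||u - v||.
    return norm(tuple(ui - vi for ui, vi in zip(u, v)))

def unit_vector(u):
    # For nonzero u, u/||u|| is the unit vector in the same direction as u.
    n = norm(u)
    if n == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(ui / n for ui in u)

u, v = (1, -2, 4, 1), (3, 1, -5, 0)
print(distance(u, v))   # 9.746... = sqrt(95), as in Example 1.6
print(norm(v))          # 5.916... = sqrt(35)
```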

We now state a fundamental relationship known as the Cauchy-Schwarz inequality.

Theorem 1.3 (Cauchy-Schwarz): For any vectors u, v ∈ R^n, |u · v| ≤ ||u|| ||v||.

Using the above inequality, we can now define the angle θ between any two nonzero
vectors u, v ∈ R^n by

    cos θ = (u · v) / (||u|| ||v||)

Note that if u · v = 0, then θ = 90° (or: θ = π/2). This then agrees with our previous
definition of orthogonality.



COMPLEX NUMBERS 

The set of complex numbers is denoted by C. Formally, a complex number is an 
ordered pair (a, b) of real numbers; equality, addition and multiplication of complex num- 
bers are defined as follows: 

    (a, b) = (c, d)   iff   a = c and b = d

    (a, b) + (c, d) = (a + c, b + d)

    (a, b)(c, d) = (ac - bd, ad + bc)






We identify the real number a with the complex number (a, 0):

    a ↔ (a, 0)

This is possible since the operations of addition and multiplication of real numbers are
preserved under the correspondence:

    (a, 0) + (b, 0) = (a + b, 0)   and   (a, 0)(b, 0) = (ab, 0)

Thus we view R as a subset of C and replace (a, 0) by a whenever convenient and possible.

The complex number (0, 1), denoted by i, has the important property that

    i^2 = ii = (0, 1)(0, 1) = (-1, 0) = -1   or   i = √(-1)

Furthermore, using the fact

    (a, b) = (a, 0) + (0, b)   and   (0, b) = (b, 0)(0, 1)

we have

    (a, b) = (a, 0) + (b, 0)(0, 1) = a + bi

The notation a + bi is more convenient than (a, b). For example, the sum and product of
complex numbers can be obtained by simply using the commutative and distributive laws
and i^2 = -1:

    (a + bi) + (c + di) = a + c + bi + di = (a + c) + (b + d)i

    (a + bi)(c + di) = ac + bci + adi + bdi^2 = (ac - bd) + (bc + ad)i

The conjugate of the complex number z = (a, b) = a + bi is denoted and defined by

    z̄ = a - bi

(Notice that zz̄ = a^2 + b^2.) If, in addition, z ≠ 0, then the inverse z^-1 of z and division by
z are given by

    z^-1 = z̄/(zz̄) = a/(a^2 + b^2) - (b/(a^2 + b^2))i   and   w/z = w z^-1

where w ∈ C. We also define

    -z = -1z   and   w - z = w + (-z)



Example 1.7: Suppose z = 2 + 3i and w = 5 - 2i. Then

    z + w = (2 + 3i) + (5 - 2i) = 2 + 5 + 3i - 2i = 7 + i

    zw = (2 + 3i)(5 - 2i) = 10 + 15i - 4i - 6i^2 = 16 + 11i

    z̄ = 2 - 3i   and   w̄ = 5 + 2i

    w/z = (5 - 2i)/(2 + 3i) = (5 - 2i)(2 - 3i)/((2 + 3i)(2 - 3i)) = (4 - 19i)/13 = 4/13 - (19/13)i
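Python's built-in complex type implements exactly these rules, so the example can be checked directly; conjugation and division are also written out below from the definitions (helper names are ours):

```python
def conjugate(z):
    # The conjugate of a + bi is a - bi.
    return complex(z.real, -z.imag)

def divide(w, z):
    # w/z = w * z^-1, where z^-1 = conj(z)/(z*conj(z)) for z != 0.
    denom = (z * conjugate(z)).real      # = a^2 + b^2, a real number
    if denom == 0:
        raise ZeroDivisionError("division by the complex number 0")
    return w * conjugate(z) / denom

z, w = 2 + 3j, 5 - 2j
print(z + w)         # (7+1j)
print(z * w)         # (16+11j)
print(conjugate(z))  # (2-3j)
print(divide(w, z))  # (0.307...-1.461...j) = 4/13 - (19/13)i
```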



Just as the real numbers can be represented by the points on a line, the complex numbers
can be represented by the points in the plane. Specifically, we let the point (a, b) in the
plane represent the complex number z = a + bi, i.e. whose real part is a and whose
imaginary part is b. The absolute value of z, written |z|, is defined as the distance from z
to the origin:

    |z| = √(a^2 + b^2)

Note that |z| is equal to the norm of the vector (a, b). Also, |z|^2 = zz̄.

Example 1.8: Suppose z = 2 + 3i and w = 12 - 5i. Then

    |z| = √(4 + 9) = √13   and   |w| = √(144 + 25) = 13



Remark: In Appendix B we define the algebraic structure called a field. We emphasize 
that the set C of complex numbers with the above operations of addition and 
multiplication is a field. 

VECTORS IN C^n

The set of all n-tuples of complex numbers, denoted by C^n, is called complex n-space.
Just as in the real case, the elements of C^n are called points or vectors, the elements of C
are called scalars, and vector addition in C^n and scalar multiplication on C^n are given by

    (z1, z2, ..., zn) + (w1, w2, ..., wn) = (z1 + w1, z2 + w2, ..., zn + wn)

    z(z1, z2, ..., zn) = (zz1, zz2, ..., zzn)

where zi, wi, z ∈ C.

Example 1.9: (2 + 3i, 4 - i, 3) + (3 - 2i, 5i, 4 - 6i) = (5 + i, 4 + 4i, 7 - 6i)

    2i(2 + 3i, 4 - i, 3) = (-6 + 4i, 2 + 8i, 6i)

Now let u and v be arbitrary vectors in C^n:

    u = (z1, z2, ..., zn),   v = (w1, w2, ..., wn),   zi, wi ∈ C

The dot, or inner, product of u and v is defined as follows:

    u · v = z1w̄1 + z2w̄2 + ··· + znw̄n

Note that this definition reduces to the previous one in the real case, since w̄i = wi when
wi is real. The norm of u is defined by

    ||u|| = √(u · u) = √(z1z̄1 + z2z̄2 + ··· + znz̄n) = √(|z1|^2 + |z2|^2 + ··· + |zn|^2)

Observe that u · u and so ||u|| are real and positive when u ≠ 0, and 0 when u = 0.

Example 1.10: Let u = (2 + 3i, 4 - i, 2i) and v = (3 - 2i, 5, 4 - 6i). Then, conjugating the
components of the second vector,

    u · v = (2 + 3i)(3 + 2i) + (4 - i)(5) + (2i)(4 + 6i)
          = 13i + 20 - 5i - 12 + 8i = 8 + 16i

    u · u = (2 + 3i)(2 - 3i) + (4 - i)(4 + i) + (2i)(-2i)
          = 13 + 17 + 4 = 34

    ||u|| = √(u · u) = √34

The space C^n with the above operations of vector addition, scalar multiplication and dot
product, is called complex Euclidean n-space.

Remark: If u · v were defined by u · v = z1w1 + ··· + znwn, then it is possible for
u · u = 0 even though u ≠ 0, e.g. if u = (1, i, 0). In fact, u · u may not even
be real.
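The only difference from the real case is the conjugation of the second vector's components; a short Python sketch (names are ours), checked against Example 1.10:

```python
import math

def cdot(u, v):
    # Complex dot product: sum of z_i * conjugate(w_i).
    return sum(zi * wi.conjugate() for zi, wi in zip(u, v))

def cnorm(u):
    # ||u|| = sqrt(u·u); u·u is real and nonnegative, so take the real part.
    return math.sqrt(cdot(u, u).real)

u = (2 + 3j, 4 - 1j, 2j)
v = (3 - 2j, 5 + 0j, 4 - 6j)
print(cdot(u, v))   # (8+16j)
print(cnorm(u))     # 5.830... = sqrt(34)
```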




Solved Problems 

VECTORS IN R^n

1.1. Compute: (i) (3, -4, 5) + (1, 1, -2); (ii) (1, 2, -3) + (4, -5); (iii) -3(4, -5, -6);
(iv) -(-6, 7, -8).

(i) Add corresponding components: (3, -4, 5) + (1, 1, -2) = (3 + 1, -4 + 1, 5 - 2) = (4, -3, 3).

(ii) The sum is not defined since the vectors have different numbers of components.

(iii) Multiply each component by the scalar: -3(4, -5, -6) = (-12, 15, 18).

(iv) Multiply each component by -1: -(-6, 7, -8) = (6, -7, 8).

1.2. Let u = (2, -7, 1), v = (-3, 0, 4), w = (0, 5, -8). Find (i) 3u - 4v, (ii) 2u + 3v - 5w.

First perform the scalar multiplication and then the vector addition.

(i) 3u - 4v = 3(2, -7, 1) - 4(-3, 0, 4) = (6, -21, 3) + (12, 0, -16) = (18, -21, -13)

(ii) 2u + 3v - 5w = 2(2, -7, 1) + 3(-3, 0, 4) - 5(0, 5, -8)
                  = (4, -14, 2) + (-9, 0, 12) + (0, -25, 40)
                  = (4 - 9 + 0, -14 + 0 - 25, 2 + 12 + 40) = (-5, -39, 54)

1.3. Find x and y if (x, 3) = (2, x + y).

Since the two vectors are equal, the corresponding components are equal to each other:

    x = 2,   3 = x + y

Substitute x = 2 into the second equation to obtain y = 1. Thus x = 2 and y = 1.

1.4. Find x and y if (4, y) = x(2, 3).

Multiply by the scalar x to obtain (4, y) = x(2, 3) = (2x, 3x).

Set the corresponding components equal to each other: 4 = 2x, y = 3x.

Solve the linear equations for x and y: x = 2 and y = 6.

1.5. Find x, y and z if (2, -3, 4) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0).

First multiply by the scalars x, y and z and then add:

    (2, -3, 4) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0)
               = (x, x, x) + (y, y, 0) + (z, 0, 0)
               = (x + y + z, x + y, x)

Now set the corresponding components equal to each other:

    x + y + z = 2,   x + y = -3,   x = 4

To solve the system of equations, substitute x = 4 into the second equation to obtain 4 + y = -3
or y = -7. Then substitute into the first equation to find z = 5. Thus x = 4, y = -7, z = 5.

1.6. Prove Theorem 1.1: For any vectors u, v, w ∈ R^n and any scalars k, k' ∈ R,

    (i)   (u + v) + w = u + (v + w)        (v)    k(u + v) = ku + kv
    (ii)  u + 0 = u                        (vi)   (k + k')u = ku + k'u
    (iii) u + (-u) = 0                     (vii)  (kk')u = k(k'u)
    (iv)  u + v = v + u                    (viii) 1u = u

Let ui, vi and wi be the ith components of u, v and w, respectively.




(i) By definition, ui + vi is the ith component of u + v and so (ui + vi) + wi is the ith component
of (u + v) + w. On the other hand, vi + wi is the ith component of v + w and so ui + (vi + wi)
is the ith component of u + (v + w). But ui, vi and wi are real numbers for which the as-
sociative law holds, that is,

    (ui + vi) + wi = ui + (vi + wi)   for i = 1, ..., n

Accordingly, (u + v) + w = u + (v + w) since their corresponding components are equal.

(ii) Here, 0 = (0, 0, ..., 0); hence

    u + 0 = (u1, u2, ..., un) + (0, 0, ..., 0)
          = (u1 + 0, u2 + 0, ..., un + 0) = (u1, u2, ..., un) = u

(iii) Since -u = -1(u1, u2, ..., un) = (-u1, -u2, ..., -un),

    u + (-u) = (u1, u2, ..., un) + (-u1, -u2, ..., -un)
             = (u1 - u1, u2 - u2, ..., un - un) = (0, 0, ..., 0) = 0

(iv) By definition, ui + vi is the ith component of u + v, and vi + ui is the ith component of v + u.
But ui and vi are real numbers for which the commutative law holds, that is,

    ui + vi = vi + ui,   i = 1, ..., n

Hence u + v = v + u since their corresponding components are equal.

(v) Since ui + vi is the ith component of u + v, k(ui + vi) is the ith component of k(u + v). Since
kui and kvi are the ith components of ku and kv respectively, kui + kvi is the ith component
of ku + kv. But k, ui and vi are real numbers; hence

    k(ui + vi) = kui + kvi,   i = 1, ..., n

Thus k(u + v) = ku + kv, as corresponding components are equal.

(vi) Observe that the first plus sign refers to the addition of the two scalars k and k' whereas the
second plus sign refers to the vector addition of the two vectors ku and k'u.

By definition, (k + k')ui is the ith component of the vector (k + k')u. Since kui and k'ui
are the ith components of ku and k'u respectively, kui + k'ui is the ith component of ku + k'u.
But k, k' and ui are real numbers; hence

    (k + k')ui = kui + k'ui,   i = 1, ..., n

Thus (k + k')u = ku + k'u, as corresponding components are equal.

(vii) Since k'ui is the ith component of k'u, k(k'ui) is the ith component of k(k'u). But (kk')ui is the
ith component of (kk')u and, since k, k' and ui are real numbers,

    (kk')ui = k(k'ui),   i = 1, ..., n

Hence (kk')u = k(k'u), as corresponding components are equal.

(viii) 1·u = 1(u1, u2, ..., un) = (1u1, 1u2, ..., 1un) = (u1, u2, ..., un) = u.

1.7. Show that 0u = 0 for any vector u, where clearly the first 0 is a scalar and the second
0 a vector.

Method 1: 0u = 0(u1, u2, ..., un) = (0u1, 0u2, ..., 0un) = (0, 0, ..., 0) = 0

Method 2: By Theorem 1.1, 0u = (0 + 0)u = 0u + 0u

Adding -0u to both sides gives us the required result.

DOT PRODUCT 

1.8. Compute u · v where: (i) u = (2, -3, 6), v = (8, 2, -3); (ii) u = (1, -8, 0, 5), v = (3, 6, 4);
(iii) u = (3, -5, 2, 1), v = (4, 1, -2, 5).

(i) Multiply corresponding components and add: u · v = 2·8 + (-3)·2 + 6·(-3) = -8.

(ii) The dot product is not defined between vectors with different numbers of components.

(iii) Multiply corresponding components and add: u · v = 3·4 + (-5)·1 + 2·(-2) + 1·5 = 8.




1.9. Determine k so that the vectors u and v are orthogonal where

(i) u = (1, k, -3) and v = (2, -5, 4)

(ii) u = (2, 3k, -4, 1, 5) and v = (6, -1, 3, 7, 2k)

In each case, compute u · v, set it equal to 0, and solve for k.

(i) u · v = 1·2 + k·(-5) + (-3)·4 = 2 - 5k - 12 = 0,   -5k - 10 = 0,   k = -2

(ii) u · v = 2·6 + 3k·(-1) + (-4)·3 + 1·7 + 5·2k
           = 12 - 3k - 12 + 7 + 10k = 0,   7k + 7 = 0,   k = -1

1.10. Prove Theorem 1.2: For any vectors u, v, w ∈ R^n and any scalar k ∈ R,

    (i)  (u + v) · w = u · w + v · w       (iii) u · v = v · u
    (ii) (ku) · v = k(u · v)               (iv)  u · u ≥ 0, and u · u = 0 iff u = 0

Let u = (u1, u2, ..., un), v = (v1, v2, ..., vn), w = (w1, w2, ..., wn).

(i) Since u + v = (u1 + v1, u2 + v2, ..., un + vn),

    (u + v) · w = (u1 + v1)w1 + (u2 + v2)w2 + ··· + (un + vn)wn
                = u1w1 + v1w1 + u2w2 + v2w2 + ··· + unwn + vnwn
                = (u1w1 + u2w2 + ··· + unwn) + (v1w1 + v2w2 + ··· + vnwn)
                = u · w + v · w

(ii) Since ku = (ku1, ku2, ..., kun),

    (ku) · v = ku1v1 + ku2v2 + ··· + kunvn = k(u1v1 + u2v2 + ··· + unvn) = k(u · v)

(iii) u · v = u1v1 + u2v2 + ··· + unvn = v1u1 + v2u2 + ··· + vnun = v · u

(iv) Since ui^2 is nonnegative for each i, and since the sum of nonnegative real numbers is non-
negative,

    u · u = u1^2 + u2^2 + ··· + un^2 ≥ 0

Furthermore, u · u = 0 iff ui = 0 for each i, that is, iff u = 0.

DISTANCE AND NORM IN R^n

1.11. Find the distance d(u, v) between the vectors u and v where: (i) u = (1, 7), v = (6, -5);
(ii) u = (3, -5, 4), v = (6, 2, -1); (iii) u = (5, 3, -2, -4, -1), v = (2, -1, 0, -7, 2).

In each case use the formula d(u, v) = √((u1 - v1)^2 + ··· + (un - vn)^2).

(i) d(u, v) = √((1 - 6)^2 + (7 + 5)^2) = √(25 + 144) = √169 = 13

(ii) d(u, v) = √((3 - 6)^2 + (-5 - 2)^2 + (4 + 1)^2) = √(9 + 49 + 25) = √83

(iii) d(u, v) = √((5 - 2)^2 + (3 + 1)^2 + (-2 - 0)^2 + (-4 + 7)^2 + (-1 - 2)^2) = √47

1.12. Find k such that d(u, v) = 6 where u = (2, k, 1, -4) and v = (3, -1, 6, -3).

    (d(u, v))^2 = (2 - 3)^2 + (k + 1)^2 + (1 - 6)^2 + (-4 + 3)^2 = k^2 + 2k + 28

Now solve k^2 + 2k + 28 = 6^2 to obtain k = 2, -4.

1.13. Find the norm ||u|| of the vector u if (i) u = (2, -7), (ii) u = (3, -12, -4).

In each case use the formula ||u|| = √(u1^2 + u2^2 + ··· + un^2).

(i) ||u|| = √(2^2 + (-7)^2) = √(4 + 49) = √53

(ii) ||u|| = √(3^2 + (-12)^2 + (-4)^2) = √(9 + 144 + 16) = √169 = 13




1.14. Determine k such that ||u|| = √39 where u = (1, k, -2, 5).

    ||u||^2 = 1^2 + k^2 + (-2)^2 + 5^2 = k^2 + 30

Now solve k^2 + 30 = 39 and obtain k = 3, -3.

1.15. Show that ||u|| ≥ 0, and ||u|| = 0 iff u = 0.

By Theorem 1.2, u · u ≥ 0, and u · u = 0 iff u = 0. Since ||u|| = √(u · u), the result follows.

1.16. Prove Theorem 1.3 (Cauchy-Schwarz):

For any vectors u = (u1, ..., un) and v = (v1, ..., vn) in R^n, |u · v| ≤ ||u|| ||v||.

We shall prove the following stronger statement: |u · v| ≤ Σ|ui vi| ≤ ||u|| ||v||.

If u = 0 or v = 0, then the inequality reduces to 0 ≤ 0 ≤ 0 and is therefore true. Hence we
need only consider the case in which u ≠ 0 and v ≠ 0, i.e. where ||u|| ≠ 0 and ||v|| ≠ 0.
Furthermore,

    |u · v| = |u1v1 + ··· + unvn| ≤ |u1v1| + ··· + |unvn| = Σ|ui vi|

Thus we need only prove the second inequality.

Now for any real numbers x, y ∈ R, 0 ≤ (x - y)^2 = x^2 - 2xy + y^2 or, equivalently,

    2xy ≤ x^2 + y^2     (1)

Set x = |ui|/||u|| and y = |vi|/||v|| in (1) to obtain, for any i,

    2 |ui| |vi| / (||u|| ||v||) ≤ |ui|^2/||u||^2 + |vi|^2/||v||^2     (2)

But, by definition of the norm of a vector, ||u||^2 = Σui^2 = Σ|ui|^2 and ||v||^2 = Σvi^2 = Σ|vi|^2. Thus
summing (2) with respect to i and using |ui vi| = |ui| |vi|, we have

    2 Σ|ui vi| / (||u|| ||v||) ≤ ||u||^2/||u||^2 + ||v||^2/||v||^2 = 2

that is,

    Σ|ui vi| / (||u|| ||v||) ≤ 1

Multiplying both sides by ||u|| ||v||, we obtain the required inequality.

1.17. Prove Minkowski's inequality:

For any vectors u = (u1, ..., un) and v = (v1, ..., vn) in R^n, ||u + v|| ≤ ||u|| + ||v||.

If ||u + v|| = 0, the inequality clearly holds. Thus we need only consider the case ||u + v|| ≠ 0.
Now |ui + vi| ≤ |ui| + |vi| for any real numbers ui, vi ∈ R. Hence

    ||u + v||^2 = Σ(ui + vi)^2 = Σ|ui + vi|^2
                = Σ|ui + vi| |ui + vi| ≤ Σ|ui + vi| (|ui| + |vi|)
                = Σ|ui + vi| |ui| + Σ|ui + vi| |vi|

But by the Cauchy-Schwarz inequality (see preceding problem),

    Σ|ui + vi| |ui| ≤ ||u + v|| ||u||   and   Σ|ui + vi| |vi| ≤ ||u + v|| ||v||

Thus

    ||u + v||^2 ≤ ||u + v|| ||u|| + ||u + v|| ||v|| = ||u + v|| (||u|| + ||v||)

Dividing by ||u + v||, we obtain the required inequality.




1.18. Prove that the norm in R^n satisfies the following laws:

[N1]: For any vector u, ||u|| ≥ 0; and ||u|| = 0 iff u = 0.

[N2]: For any vector u and any scalar k, ||ku|| = |k| ||u||.

[N3]: For any vectors u and v, ||u + v|| ≤ ||u|| + ||v||.

[N1] was proved in Problem 1.15, and [N3] in Problem 1.17. Hence we need only prove that
[N2] holds.

Suppose u = (u1, u2, ..., un) and so ku = (ku1, ku2, ..., kun). Then

    ||ku||^2 = (ku1)^2 + (ku2)^2 + ··· + (kun)^2 = k^2 u1^2 + k^2 u2^2 + ··· + k^2 un^2 = k^2 ||u||^2

The square root of both sides of the equality gives us the required result.



COMPLEX NUMBERS 

1.19. Simplify: (i) (5 + 3i)(2 - 7i); (ii) (4 - 3i)^2; (iii) 1/(3 - 4i); (iv) (2 - 7i)/(5 + 3i);
(v) i^3, i^4, i^31; (vi) (1 + 2i)^3; (vii) (1/(2 - 3i))^2.

(i) (5 + 3i)(2 - 7i) = 10 + 6i - 35i - 21i^2 = 31 - 29i

(ii) (4 - 3i)^2 = 16 - 24i + 9i^2 = 7 - 24i

(iii) 1/(3 - 4i) = (3 + 4i)/((3 - 4i)(3 + 4i)) = (3 + 4i)/25 = 3/25 + (4/25)i

(iv) (2 - 7i)/(5 + 3i) = (2 - 7i)(5 - 3i)/((5 + 3i)(5 - 3i)) = (-11 - 41i)/34 = -11/34 - (41/34)i

(v) i^3 = i^2·i = (-1)i = -i;   i^4 = (i^2)^2 = 1;   i^31 = (i^4)^7·i^3 = 1·(-i) = -i

(vi) (1 + 2i)^3 = 1 + 6i + 12i^2 + 8i^3 = 1 + 6i - 12 - 8i = -11 - 2i

(vii) (1/(2 - 3i))^2 = 1/(-5 - 12i) = (-5 + 12i)/((-5 - 12i)(-5 + 12i)) = (-5 + 12i)/169 = -5/169 + (12/169)i

1.20. Let z = 2 - 3i and w = 4 + 5i. Find:

(i) z + w and zw; (ii) z/w; (iii) z̄ and w̄; (iv) |z| and |w|.

(i) z + w = 2 - 3i + 4 + 5i = 6 + 2i

    zw = (2 - 3i)(4 + 5i) = 8 - 12i + 10i - 15i^2 = 23 - 2i

(ii) z/w = (2 - 3i)/(4 + 5i) = (2 - 3i)(4 - 5i)/((4 + 5i)(4 - 5i)) = (-7 - 22i)/41 = -7/41 - (22/41)i

(iii) Use the fact that the conjugate of a + bi is a - bi:   z̄ = 2 + 3i;   w̄ = 4 - 5i.

(iv) Use |a + bi| = √(a^2 + b^2):   |z| = |2 - 3i| = √(4 + 9) = √13;   |w| = |4 + 5i| = √(16 + 25) = √41.

1.21. Prove: For any complex numbers z, w ∈ C,
(i) (z + w)‾ = z̄ + w̄, (ii) (zw)‾ = z̄ w̄, (iii) (z̄)‾ = z.

Suppose z = a + bi and w = c + di where a, b, c, d ∈ R.

(i) (z + w)‾ = ((a + bi) + (c + di))‾ = ((a + c) + (b + d)i)‾
            = (a + c) - (b + d)i = a + c - bi - di
            = (a - bi) + (c - di) = z̄ + w̄

(ii) (zw)‾ = ((a + bi)(c + di))‾ = ((ac - bd) + (ad + bc)i)‾
          = (ac - bd) - (ad + bc)i = (a - bi)(c - di) = z̄ w̄

(iii) Since z̄ = a - bi,   (z̄)‾ = (a - bi)‾ = a - (-b)i = a + bi = z



1.22. Prove: For any complex numbers z, w ∈ C, |zw| = |z| |w|.

Suppose z = a + bi and w = c + di where a, b, c, d ∈ R. Then

    |z|^2 = a^2 + b^2,   |w|^2 = c^2 + d^2,   and   zw = (ac - bd) + (ad + bc)i

Thus

    |zw|^2 = (ac - bd)^2 + (ad + bc)^2
           = a^2c^2 - 2abcd + b^2d^2 + a^2d^2 + 2abcd + b^2c^2
           = a^2(c^2 + d^2) + b^2(c^2 + d^2) = (a^2 + b^2)(c^2 + d^2) = |z|^2 |w|^2

The square root of both sides gives us the desired result.

1.23. Prove: For any complex numbers z, w ∈ C, |z + w| ≤ |z| + |w|.

Suppose z = a + bi and w = c + di where a, b, c, d ∈ R. Consider the vectors u = (a, b) and
v = (c, d) in R^2. Note that

    |z| = √(a^2 + b^2) = ||u||,   |w| = √(c^2 + d^2) = ||v||

and

    |z + w| = |(a + c) + (b + d)i| = √((a + c)^2 + (b + d)^2) = ||(a + c, b + d)|| = ||u + v||

By Minkowski's inequality (Problem 1.17), ||u + v|| ≤ ||u|| + ||v|| and so

    |z + w| = ||u + v|| ≤ ||u|| + ||v|| = |z| + |w|

VECTORS IN C^n

1.24. Let u = (3 - 2i, 4i, 1 + 6i) and v = (5 + i, 2 - 3i, 5). Find:
(i) u + v, (ii) 4iu, (iii) (1 + i)v, (iv) (1 - 2i)u + (3 + i)v.

(i) Add corresponding components: u + v = (8 - i, 2 + i, 6 + 6i).

(ii) Multiply each component of u by the scalar 4i: 4iu = (8 + 12i, -16, -24 + 4i).

(iii) Multiply each component of v by the scalar 1 + i:

    (1 + i)v = (5 + 6i + i^2, 2 - i - 3i^2, 5 + 5i) = (4 + 6i, 5 - i, 5 + 5i)

(iv) First perform the scalar multiplication and then the vector addition:

    (1 - 2i)u + (3 + i)v = (-1 - 8i, 8 + 4i, 13 + 4i) + (14 + 8i, 9 - 7i, 15 + 5i)
                         = (13, 17 - 3i, 28 + 9i)

1.25. Find u · v and v · u where: (i) u = (1 - 2i, 3 + i), v = (4 + 2i, 5 - 6i); (ii) u =
(3 - 2i, 4i, 1 + 6i), v = (5 + i, 2 - 3i, 7 + 2i).

Recall that the conjugates of the components of the second vector appear in the dot product:

    (z1, ..., zn) · (w1, ..., wn) = z1w̄1 + ··· + znw̄n

(i) u · v = (1 - 2i)(4 + 2i)‾ + (3 + i)(5 - 6i)‾
          = (1 - 2i)(4 - 2i) + (3 + i)(5 + 6i) = -10i + 9 + 23i = 9 + 13i

    v · u = (4 + 2i)(1 - 2i)‾ + (5 - 6i)(3 + i)‾
          = (4 + 2i)(1 + 2i) + (5 - 6i)(3 - i) = 10i + 9 - 23i = 9 - 13i

(ii) u · v = (3 - 2i)(5 + i)‾ + (4i)(2 - 3i)‾ + (1 + 6i)(7 + 2i)‾
           = (3 - 2i)(5 - i) + (4i)(2 + 3i) + (1 + 6i)(7 - 2i) = 20 + 35i

     v · u = (5 + i)(3 - 2i)‾ + (2 - 3i)(4i)‾ + (7 + 2i)(1 + 6i)‾
           = (5 + i)(3 + 2i) + (2 - 3i)(-4i) + (7 + 2i)(1 - 6i) = 20 - 35i

In both examples, v · u = (u · v)‾. This holds true in general, as seen in Problem 1.27.




1.26. Find ||u|| where: (i) u = (3 + 4i, 5 - 2i, 1 - 3i); (ii) u = (4 - i, 2i, 3 + 2i, 1 - 5i).

Recall that zz̄ = a^2 + b^2 when z = a + bi. Use

    ||u||^2 = u · u = z1z̄1 + z2z̄2 + ··· + znz̄n   where u = (z1, z2, ..., zn)

(i) ||u||^2 = 3^2 + 4^2 + 5^2 + (-2)^2 + 1^2 + (-3)^2 = 64, or ||u|| = 8

(ii) ||u||^2 = 4^2 + (-1)^2 + 2^2 + 3^2 + 2^2 + 1^2 + (-5)^2 = 60, or ||u|| = √60 = 2√15


1.27. Prove: For any vectors u, v ∈ C^n and any scalar z ∈ C, (i) u · v = (v · u)‾, (ii) (zu) · v =
z(u · v), (iii) u · (zv) = z̄(u · v). (Compare with Theorem 1.2.)

Suppose u = (z1, z2, ..., zn) and v = (w1, w2, ..., wn).

(i) Using the properties of the conjugate established in Problem 1.21,

    (v · u)‾ = (w1z̄1 + w2z̄2 + ··· + wnz̄n)‾ = (w1z̄1)‾ + (w2z̄2)‾ + ··· + (wnz̄n)‾
             = w̄1z1 + w̄2z2 + ··· + w̄nzn = z1w̄1 + z2w̄2 + ··· + znw̄n = u · v

(ii) Since zu = (zz1, zz2, ..., zzn),

    (zu) · v = zz1w̄1 + zz2w̄2 + ··· + zznw̄n = z(z1w̄1 + z2w̄2 + ··· + znw̄n) = z(u · v)

(iii) Method 1. Since zv = (zw1, zw2, ..., zwn),

    u · (zv) = z1(zw1)‾ + z2(zw2)‾ + ··· + zn(zwn)‾ = z1z̄w̄1 + z2z̄w̄2 + ··· + znz̄w̄n
             = z̄(z1w̄1 + z2w̄2 + ··· + znw̄n) = z̄(u · v)

Method 2. Using (i) and (ii),

    u · (zv) = ((zv) · u)‾ = (z(v · u))‾ = z̄ (v · u)‾ = z̄(u · v)

MISCELLANEOUS PROBLEMS

1.28. Let u = (3, -2, 1, 4) and v = (7, 1, -3, 6). Find:

(i) u + v; (ii) 4u; (iii) 2u - 3v; (iv) u · v; (v) ||u|| and ||v||; (vi) d(u, v).

(i) u + v = (3 + 7, -2 + 1, 1 - 3, 4 + 6) = (10, -1, -2, 10)

(ii) 4u = (4·3, 4·(-2), 4·1, 4·4) = (12, -8, 4, 16)

(iii) 2u - 3v = (6, -4, 2, 8) + (-21, -3, 9, -18) = (-15, -7, 11, -10)

(iv) u · v = 21 - 2 - 3 + 24 = 40

(v) ||u|| = √(9 + 4 + 1 + 16) = √30,   ||v|| = √(49 + 1 + 9 + 36) = √95

(vi) d(u, v) = √((3 - 7)^2 + (-2 - 1)^2 + (1 + 3)^2 + (4 - 6)^2) = √45 = 3√5

1.29. Let u = (7 - 2i, 2 + 5i) and v = (1 + i, -3 - 6i). Find:

(i) u + v; (ii) 2iu; (iii) (3 - i)v; (iv) u · v; (v) ||u|| and ||v||.

(i) u + v = (7 - 2i + 1 + i, 2 + 5i - 3 - 6i) = (8 - i, -1 - i)

(ii) 2iu = (14i - 4i^2, 4i + 10i^2) = (4 + 14i, -10 + 4i)

(iii) (3 - i)v = (3 + 3i - i - i^2, -9 - 18i + 3i + 6i^2) = (4 + 2i, -15 - 15i)

(iv) u · v = (7 - 2i)(1 + i)‾ + (2 + 5i)(-3 - 6i)‾
           = (7 - 2i)(1 - i) + (2 + 5i)(-3 + 6i) = 5 - 9i - 36 - 3i = -31 - 12i

(v) ||u|| = √(7^2 + (-2)^2 + 2^2 + 5^2) = √82,   ||v|| = √(1^2 + 1^2 + (-3)^2 + (-6)^2) = √47






1.30. Any pair of points P = (ai) and Q = (bi) in R^n defines the directed line segment
from P to Q, written PQ. We identify PQ with the vector v = Q - P:

    PQ = v = (b1 - a1, b2 - a2, ..., bn - an)

Find the vector v identified with PQ where:

(i) P = (2, 5), Q = (-3, 4)

(ii) P = (1, -2, 4), Q = (6, 0, -3)

(i) v = Q - P = (-3 - 2, 4 - 5) = (-5, -1)

(ii) v = Q - P = (6 - 1, 0 + 2, -3 - 4) = (5, 2, -7)




1.31. The set H of elements in R^n which are solutions of a linear equation in n unknowns
x1, ..., xn of the form

    c1x1 + c2x2 + ··· + cnxn = b     (*)

with u = (c1, ..., cn) ≠ 0 in R^n, is called a hyperplane of R^n, and (*) is called an equa-
tion of H. (We frequently identify H with (*).) Show that the directed line segment
PQ of any pair of points P, Q ∈ H is orthogonal to the coefficient vector u; the vector
u is said to be normal to the hyperplane H.

Suppose P = (a1, ..., an) and Q = (b1, ..., bn). Then the ai and the bi are solutions of the
given equation:

    c1a1 + c2a2 + ··· + cnan = b,   c1b1 + c2b2 + ··· + cnbn = b

Let

    v = PQ = Q - P = (b1 - a1, b2 - a2, ..., bn - an)

Then

    u · v = c1(b1 - a1) + c2(b2 - a2) + ··· + cn(bn - an)
          = c1b1 - c1a1 + c2b2 - c2a2 + ··· + cnbn - cnan
          = (c1b1 + c2b2 + ··· + cnbn) - (c1a1 + c2a2 + ··· + cnan) = b - b = 0

Hence v, that is, PQ, is orthogonal to u.

1.32. Find an equation of the hyperplane H in R^4 if: (i) H passes through P = (3, -2, 1, -4)
and is normal to u = (2, 5, -6, -2); (ii) H passes through P = (1, -2, 3, 5) and is
parallel to the hyperplane H' determined by 4x - 5y + 2z + w = 11.

(i) An equation of H is of the form 2x + 5y - 6z - 2w = k since it is normal to u. Substitute
P into this equation to obtain k = -2. Thus an equation of H is 2x + 5y - 6z - 2w = -2.

(ii) H and H' are parallel iff corresponding normal vectors are in the same or opposite direction.
Hence an equation of H is of the form 4x - 5y + 2z + w = k. Substituting P into this equa-
tion, we find k = 25. Thus an equation of H is 4x - 5y + 2z + w = 25.
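Both parts follow the same recipe: keep the coefficients of the normal vector and choose the constant so that the given point satisfies the equation. A minimal Python sketch of that recipe (names are ours):

```python
def hyperplane_through(point, normal):
    # Equation c1*x1 + ... + cn*xn = k with (c1, ..., cn) = normal and
    # k chosen so that the given point lies on the hyperplane.
    k = sum(c * p for c, p in zip(normal, point))
    return normal, k

# Problem 1.32(i): through (3, -2, 1, -4) with normal (2, 5, -6, -2).
coeffs, k = hyperplane_through((3, -2, 1, -4), (2, 5, -6, -2))
print(coeffs, k)   # (2, 5, -6, -2) -2, i.e. 2x + 5y - 6z - 2w = -2
```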



1.33. The line l in R^n passing through the point P = (ai) and in the direction of u = (ui) ≠ 0
consists of the points X = P + tu, t ∈ R, that is, consists of the points X = (xi) obtained from

    x1 = a1 + u1t
    x2 = a2 + u2t
    .............     (*)
    xn = an + unt

where t takes on all real values. The variable t is called a parameter, and (*) is called
a parametric representation of l.





(i) Find a parametric representation of the line passing through P and in the direc-
tion of u where: (a) P = (2, 5) and u = (-3, 4); (b) P = (4, -2, 3, 1) and u =
(2, 5, -7, 11).

(ii) Find a parametric representation of the line passing through the points P and Q
where: (a) P = (7, -2) and Q = (9, 3); (b) P = (5, 4, -3) and Q = (1, -3, 2).

(i) In each case use the formula (*).

    (a)  x = 2 - 3t          (b)  x = 4 + 2t
         y = 5 + 4t               y = -2 + 5t
                                  z = 3 - 7t
                                  w = 1 + 11t

(In R^2 we usually eliminate t from the two equations and represent the line by a single
equation: 4x + 3y = 23.)

(ii) First compute u = PQ = Q - P. Then use the formula (*).

    (a) u = Q - P = (2, 5)   (b) u = Q - P = (-4, -7, 5)

        x = 7 + 2t               x = 5 - 4t
        y = -2 + 5t              y = 4 - 7t
                                 z = -3 + 5t

(Note that in each case we could also write u = QP = P - Q.)
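The parametric representation is easy to generate mechanically in any dimension; a short Python sketch (names are ours) that builds the direction u = Q - P and the point X(t) = P + tu:

```python
def line_through(P, Q):
    # Direction u = Q - P, then X(t) = P + t*u componentwise.
    u = tuple(qi - pi for pi, qi in zip(P, Q))
    def point_at(t):
        return tuple(pi + ui * t for pi, ui in zip(P, u))
    return u, point_at

# Problem 1.33(ii)(b): the line through P = (5, 4, -3) and Q = (1, -3, 2).
u, X = line_through((5, 4, -3), (1, -3, 2))
print(u)      # (-4, -7, 5)
print(X(0))   # (5, 4, -3), i.e. P (t = 0)
print(X(1))   # (1, -3, 2), i.e. Q (t = 1)
```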



Supplementary Problems

VECTORS IN R^n

1.34. Let u = (1, -2, 5), v = (3, 1, -2). Find: (i) u + v; (ii) -6u; (iii) 2u - 5v; (iv) u · v; (v) ||u||
and ||v||; (vi) d(u, v).

1.35. Let u = (2, -1, 0, -3), v = (1, -1, -1, 3), w = (1, 3, -2, 2). Find: (i) 2u - 3v; (ii) 5u - 3v - 4w;
(iii) -u + 2v - 2w; (iv) u · v, u · w and v · w; (v) d(u, v) and d(v, w).

1.36. Let u = (2, 1, -3, 0, 4), v = (5, -3, -1, 2, 7). Find: (i) u + v; (ii) 3u - 2v; (iii) u · v; (iv) ||u||
and ||v||; (v) d(u, v).

1.37. Determine k so that the vectors u and v are orthogonal. (i) u = (3, k, -2), v = (6, -4, -3). (ii) u =
(5, k, -4, 2), v = (1, -3, 2, 2k). (iii) u = (1, 7, k + 2, -2), v = (3, k, -3, k).

1.38. Determine x and y if: (i) (x, x + y) = (y - 2, 6); (ii) x(1, 2) = -4(y, 3).

1.39. Determine x and y if: (i) x(3, 2) = 2(y, -1); (ii) x(2, y) = y(1, -2).

1.40. Determine x, y and z if:

    (i) (3, -1, 2) = x(1, 1, 1) + y(1, -1, 0) + z(1, 0, 0)
    (ii) (-1, 3, 3) = x(1, 1, 0) + y(0, 0, -1) + z(0, 1, 1)

1.41. Let e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1). Show that for any vector u = (a, b, c) in R^3:
(i) u = ae1 + be2 + ce3; (ii) u · e1 = a, u · e2 = b, u · e3 = c.

1.42. Generalize the result in the preceding problem as follows. Let ei ∈ R^n be the vector with 1 in the
ith coordinate and 0 elsewhere:

    e1 = (1, 0, 0, ..., 0, 0),   e2 = (0, 1, 0, ..., 0, 0),   ...,   en = (0, 0, 0, ..., 0, 1)

Show that for any vector u = (a1, a2, ..., an),

    (i) u = a1e1 + a2e2 + ··· + anen,   (ii) u · ei = ai for i = 1, ..., n

1.43. Suppose u ∈ R^n has the property that u · v = 0 for every v ∈ R^n. Show that u = 0.

1.44. Using d(u, v) = ||u - v|| and the norm properties [N1], [N2] and [N3] in Problem 1.18, show that
the distance function satisfies the following properties for any vectors u, v, w ∈ R^n:

(i) d(u, v) ≥ 0, and d(u, v) = 0 iff u = v; (ii) d(u, v) = d(v, u); (iii) d(u, w) ≤ d(u, v) + d(v, w).

COMPLEX NUMBERS 

1.45. Simplify: (i) (4 - 7t)(9 + 2i); (ii) (3-5i)2; (iii) j^^; (iv)|-3||; (v) (l-i)^. 

1.46. Simplify: (i) ^; (ii) f^; (iii) i^s, ^2^ ^^*; (iv) (^3^ .' 

1.47. Let z = 2 - 5i and w = 7 + 3i. Find: (i) z + w; (ii) zw; (iii) z/w; (iv) z̄, w̄; (v) |z|, |w|.

1.48. Let z = 2 + i and w = 6 - 5i. Find: (i) z/w; (ii) z̄, w̄; (iii) |z|, |w|.

1.49. Show that: (i) zz^-1 = 1; (ii) |z̄| = |z|; (iii) real part of z = ½(z + z̄); (iv) imaginary part of
z = (z - z̄)/2i.

1.50. Show that zw = 0 implies z = 0 or w = 0.

VECTORS IN C^n

1.51. Let u = (1 + 7i, 2 - 6i) and v = (5 - 2i, 3 - 4i). Find: (i) u + v; (ii) (3 + i)u; (iii) 2iu + (4 - 7i)v;
(iv) u · v and v · u; (v) ||u|| and ||v||.

1.52. Let u = (3 - 7i, 2i, -1 + i) and v = (4 - i, 11 + 2i, 8 - 3i). Find: (i) u - v; (ii) (3 + i)v; (iii)
u · v and v · u; (iv) ||u|| and ||v||.

1.53. Prove: For any vectors u, v, w ∈ C^n:

(i) (u + v) · w = u · w + v · w; (ii) w · (u + v) = w · u + w · v. (Compare with Theorem 1.2.)

1.54. Prove that the norm in C^n satisfies the following laws:

[N1]: For any vector u, ||u|| ≥ 0; and ||u|| = 0 iff u = 0.

[N2]: For any vector u and any complex number z, ||zu|| = |z| ||u||.

[N3]: For any vectors u and v, ||u + v|| ≤ ||u|| + ||v||.

(Compare with Problem 1.18.)

MISCELLANEOUS PROBLEMS

1.55. Find an equation of the hyperplane in R^3 which:

(i) passes through (2, -7, 1) and is normal to (3, 1, -11);

(ii) contains (1, -2, 2), (0, 1, 3) and (0, 2, -1);

(iii) contains (1, -5, 2) and is parallel to 3x - 7y + 4z = 5.

1.56. Determine the value of k such that 2x - ky + 4z - 5w = 11 is perpendicular to 7x + 2y - z +
2w = 8. (Two hyperplanes are perpendicular iff corresponding normal vectors are orthogonal.)




1.57. Find a parametric representation of the line which:

(i) passes through (7, -1, 8) in the direction of (1, 3, -5)

(ii) passes through (1, 9, -4, 5) and (2, -3, 0, 4)

(iii) passes through (4, -1, 9) and is perpendicular to the plane 3x - 2y + z = 18

1.58. Let P, Q and R be the points on the line determined by

    x1 = a1 + u1t,   x2 = a2 + u2t,   ...,   xn = an + unt

which correspond respectively to the values t1, t2 and t3 for t. Show that if t1 < t2 < t3, then

    d(P, Q) + d(Q, R) = d(P, R).

Answers to Supplementary Problems 

1.34. (i) u + v = (4, -1, 3); (ii) -6u = (-6, 12, -30); (iii) 2u - 5v = (-13, -9, 20); (iv) u · v = -9;
(v) ||u|| = √30, ||v|| = √14; (vi) d(u, v) = √62

1.35. (i) 2u - 3v = (1, 1, 3, -15); (ii) 5u - 3v - 4w = (3, -14, 11, -32); (iii) -u + 2v - 2w = (-2, -7, 2, 5);
(iv) u · v = -6, u · w = -7, v · w = 6; (v) d(u, v) = √38, d(v, w) = 3√2

1.36. (i) u + v = (7, -2, -4, 2, 11); (ii) 3u - 2v = (-4, 9, -7, -4, -2); (iii) u · v = 38; (iv) ||u|| = √30,
||v|| = 2√22; (v) d(u, v) = √42

1.37. (i) k = 6; (ii) k = 3; (iii) k = 3/2

1.38. (i) x = 2, y = 4; (ii) x = -6, y = 3/2

1.39. (i) x = -1, y = -3/2; (ii) x = 0, y = 0; or x = -2, y = -4

1.40. (i) x = 2, y = 3, z = -2; (ii) x = -1, y = 1, z = 4

1.43. Taking v = u, we have u · u = 0, which implies that u = 0.

1.45. (i) 50-55i; (ii) -16 - 30i; (iii) (4 + 7t)/65; (iv) (H-3i)/2; (v) -2 - 2i. 

1.46. (i) -^i; (ii) (5 + 27t)/58; (iii) -i, i, -1; (iv) (4 + 3i)/50. 

1.47. (i) z + w = 9 - 2i; (ii) zw = 29 - 29i; (iii) z/w = (-1 - 41i)/58; (iv) z̄ = 2 + 5i, w̄ = 7 - 3i;
(v) |z| = √29, |w| = √58.

1.48. (i) z/w = (7 + 16i)/61; (ii) z̄ = 2 - i, w̄ = 6 + 5i; (iii) |z| = √5, |w| = √61.

1.50. If zw = 0, then |zw| = |z| |w| = |0| = 0. Hence |z| = 0 or |w| = 0; and so z = 0 or w = 0.

1.51. (i) u + v = (6 + 5i, 5 - 10i)              (iv) u · v = 21 + 27i, v · u = 21 - 27i
     (ii) (3 + i)u = (-4 + 22i, 12 - 16i)        (v) ||u|| = 3√10, ||v|| = 3√6
     (iii) 2iu + (4 - 7i)v = (-8 - 41i, -4 - 33i)

1.52. (i) u - v = (-1 - 6i, -11, -9 + 4i)        (iii) u · v = 12 + 2i, v · u = 12 - 2i
     (ii) (3 + i)v = (13 + i, 31 + 17i, 27 - i)  (iv) ||u|| = 8, ||v|| = √215

1.55. (i) 3x + y - 11z = -12; (ii) 13x + 4y + z = 7; (iii) 3x - 7y + 4z = 46.

1.56. k = 0

1.57. (i) x = 7 + t, y = -1 + 3t, z = 8 - 5t
     (ii) x = 1 + t, y = 9 - 12t, z = -4 + 4t, w = 5 - t
     (iii) x = 4 + 3t, y = -1 - 2t, z = 9 + t





chapter 2 



Linear Equations 



INTRODUCTION 

The theory of linear equations plays an important and motivating role in the subject 
of linear algebra. In fact, many problems in linear algebra are equivalent to studying a 
system of linear equations, e.g. finding the kernel of a linear mapping and characterizing 
the subspace spanned by a set of vectors. Thus the techniques introduced in this chapter 
will be applicable to the more abstract treatment given later. On the other hand, some of 
the results of the abstract treatment will give us new insights into the structure of "con- 
crete" systems of linear equations. 

For simplicity, we assume that all equations in this chapter are over the real field R. We 
emphasize that the results and techniques also hold for equations over the complex field C 
or over any arbitrary field K. 

LINEAR EQUATION

By a linear equation over the real field R, we mean an expression of the form

    a1x1 + a2x2 + ··· + anxn = b     (1)

where the ai, b ∈ R and the xi are indeterminates (or: unknowns or variables). The scalars
ai are called the coefficients of the xi respectively, and b is called the constant term or simply
constant of the equation. A set of values for the unknowns, say

    x1 = k1,   x2 = k2,   ...,   xn = kn

is a solution of (1) if the statement obtained by substituting ki for xi,

    a1k1 + a2k2 + ··· + ankn = b

is true. This set of values is then said to satisfy the equation. If there is no ambiguity
about the position of the unknowns in the equation, then we denote this solution by simply
the n-tuple

    u = (k1, k2, ..., kn)

Example 2.1: Consider the equation x + 2y - 4z + w = 3.

The 4-tuple u = (3, 2, 1, 0) is a solution of the equation since

    3 + 2·2 - 4·1 + 0 = 3   or   3 = 3

is a true statement. However, the 4-tuple v = (1, 2, 4, 5) is not a solution of the
equation since

    1 + 2·2 - 4·4 + 5 = 3   or   -6 = 3

is not a true statement.

Solutions of the equation (1) can be easily described and obtained. There are three
cases:

Case (i): One of the coefficients in (1) is not zero, say a1 ≠ 0. Then we can rewrite the
equation as follows:

    a1x1 = b - a2x2 - ··· - anxn   or   x1 = a1^-1 b - a1^-1 a2x2 - ··· - a1^-1 anxn




By arbitrarily assigning values to the unknowns x2, ..., xn, we obtain a value for x1; these
values form a solution of the equation. Furthermore, every solution of the equation can
be obtained in this way. Note in particular that the linear equation in one unknown,

    ax = b,   with a ≠ 0

has the unique solution x = a^-1 b.

Example 2.2: Consider the equation 2x - 4y + z = 8.

We rewrite the equation as

    2x = 8 + 4y - z   or   x = 4 + 2y - ½z

Any value for y and z will yield a value for x, and the three values will be a solution
of the equation. For example, let y = 3 and z = 2; then x = 4 + 2·3 - ½·2 = 9.
In other words, the 3-tuple u = (9, 3, 2) is a solution of the equation.

Case (ii): All the coefficients in (1) are zero, but the constant is not zero. That is, the
equation is of the form

    0x1 + 0x2 + ··· + 0xn = b,   with b ≠ 0

Then the equation has no solution.

Case (iii): All the coefficients in (1) are zero, and the constant is also zero. That is,
the equation is of the form

    0x1 + 0x2 + ··· + 0xn = 0

Then every n-tuple of scalars in R is a solution of the equation.



SYSTEM OF LINEAR EQUATIONS

We now consider a system of m linear equations in the n unknowns x1, ..., xn:

    a11x1 + a12x2 + ··· + a1nxn = b1
    a21x1 + a22x2 + ··· + a2nxn = b2
    .................................     (*)
    am1x1 + am2x2 + ··· + amnxn = bm

where the aij, bi belong to the real field R. The system is said to be homogeneous if the con-
stants b1, ..., bm are all 0. An n-tuple u = (k1, ..., kn) of real numbers is a solution (or:
a particular solution) if it satisfies each of the equations; the set of all such solutions is
termed the solution set or the general solution.

The system of linear equations

    a11x1 + a12x2 + ··· + a1nxn = 0
    a21x1 + a22x2 + ··· + a2nxn = 0
    .................................     (**)
    am1x1 + am2x2 + ··· + amnxn = 0

is called the homogeneous system associated with (*). The above system always has a solu-
tion, namely the zero n-tuple 0 = (0, 0, ..., 0) called the zero or trivial solution. Any
other solution, if it exists, is called a nonzero or nontrivial solution.

The fundamental relationship between the systems (*) and (**) follows.



Theorem 2.1: Suppose u is a particular solution of the nonhomogeneous system (*) and
suppose W is the general solution of the associated homogeneous system (**). Then

    u + W = {u + w : w ∈ W}

is the general solution of the nonhomogeneous system (*).

We emphasize that the above theorem is of theoretical interest and does not help us to
obtain explicit solutions of the system (*). This is done by the usual method of elimination
described in the next section.

SOLUTION OF A SYSTEM OF LINEAR EQUATIONS

Consider the above system (*) of linear equations. We reduce it to a simpler system as
follows:

Step 1. Interchange equations so that the first unknown x1 has a nonzero coeffi-
cient in the first equation, that is, so that a11 ≠ 0.

Step 2. For each i > 1, apply the operation

    Li → -ai1·L1 + a11·Li

That is, replace the ith linear equation Li by the equation obtained by mul-
tiplying the first equation L1 by -ai1, multiplying the ith equation Li by
a11, and then adding.

We then obtain the following system which (Problem 2.13) is equivalent to (*), i.e. has
the same solution set as (*):

    a11x1 + a12x2 + a13x3 + ··· + a1nxn = b1
            a2j2 xj2 + ··············· + a2nxn = b2
            ..................................
            amj2 xj2 + ··············· + amnxn = bm

where a11 ≠ 0. Here xj2 denotes the first unknown with a nonzero coefficient in an equation
other than the first; by Step 2, xj2 ≠ x1. This process which eliminates an unknown from
succeeding equations is known as (Gauss) elimination.
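The two steps translate almost literally into code. The following Python sketch (names are ours) performs one elimination pass on a system stored as rows [a_i1, ..., a_in, b_i], using the operation Li → -a_i1·L1 + a_11·Li, applied to the system of Example 2.3 below:

```python
def eliminate_first_unknown(rows):
    # rows[i] = [a_i1, ..., a_in, b_i].  Step 1: bring a nonzero leading
    # coefficient to the top; Step 2: replace Li by -a_i1*L1 + a_11*Li.
    pivot = next(i for i, r in enumerate(rows) if r[0] != 0)
    rows[0], rows[pivot] = rows[pivot], rows[0]
    a11 = rows[0][0]
    for i in range(1, len(rows)):
        ai1 = rows[i][0]
        rows[i] = [-ai1 * x + a11 * y for x, y in zip(rows[0], rows[i])]
    return rows

# Example 2.3: 2x+4y-z+2v+2w=1,  3x+6y+z-v+4w=-7,  4x+8y+z+5v-w=3
system = [[2, 4, -1, 2, 2, 1], [3, 6, 1, -1, 4, -7], [4, 8, 1, 5, -1, 3]]
print(eliminate_first_unknown(system))
# [[2, 4, -1, 2, 2, 1], [0, 0, 5, -8, 2, -17], [0, 0, 6, 2, -10, 2]]
# The last row is twice the equation 3z + v - 5w = 1; the text divides out the
# common factor 2, giving the same (equivalent) equation.
```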

Example 2.3: Consider the following system of linear equations:

    2x + 4y - z + 2v + 2w = 1
    3x + 6y + z - v + 4w = -7
    4x + 8y + z + 5v - w = 3

We eliminate the unknown x from the second and third equations by applying the
following operations:

    L2 → -3L1 + 2L2   and   L3 → -2L1 + L3

We compute

    -3L1:          -6x - 12y + 3z - 6v - 6w = -3
     2L2:           6x + 12y + 2z - 2v + 8w = -14
    -3L1 + 2L2:                5z - 8v + 2w = -17

and

    -2L1:          -4x - 8y + 2z - 4v - 4w = -2
     L3:            4x + 8y + z + 5v - w = 3
    -2L1 + L3:                 3z + v - 5w = 1




Thus the original system has been reduced to the following equivalent system:

    2x + 4y - z + 2v + 2w = 1
              5z - 8v + 2w = -17
               3z + v - 5w = 1

Observe that y has also been eliminated from the second and third equations. Here
the unknown z plays the role of the unknown xj2 above.

We note that the above equations, excluding the first, form a subsystem which has
fewer equations and fewer unknowns than the original system (*). We also note that:

(i) if an equation 0x1 + ··· + 0xn = b, b ≠ 0 occurs, then the system is incon-
sistent and has no solution;

(ii) if an equation 0x1 + ··· + 0xn = 0 occurs, then the equation can be deleted
without affecting the solution.

Continuing the above process with each new "smaller" subsystem, we obtain by induction
that the system (*) is either inconsistent or is reducible to an equivalent system in the
following form

    a11x1 + a12x2 + a13x3 + ·················· + a1nxn = b1
            a2j2 xj2 + a2,j2+1 xj2+1 + ··· + a2nxn = b2
            ..........................................     (***)
            arjr xjr + ar,jr+1 xjr+1 + ··· + arnxn = br

where 1 < j2 < ··· < jr and where the leading coefficients are not zero:

    a11 ≠ 0,   a2j2 ≠ 0,   ...,   arjr ≠ 0

(For notational convenience we use the same symbols aij, bk in the system (***) as we used
in the system (*), but clearly they may denote different scalars.)

Definition: The above system (***) is said to be in echelon form; the unknowns xi which
do not appear at the beginning of any equation (i ≠ 1, j2, ..., jr) are termed
free variables.

The following theorem applies. 

Theorem 2.2: The solution of the system (***) in echelon form is as follows. There are 
two cases: 

(i) r = n. That is, there are as many equations as unknowns. Then the
system has a unique solution.

(ii) r <n. That is, there are fewer equations than unknowns. Then we 
can arbitrarily assign values to the n — r free variables and obtain a 
solution of the system. 

Note in particular that the above theorem implies that the system (***) and any equiv- 
alent systems are consistent. Thus if the system (*) is consistent and reduces to case (ii) 
above, then we can assign many different values to the free variables and so obtain many 
solutions of the system. The following diagram illustrates this situation. 



                 System of linear equations
                   /                    \
            Inconsistent             Consistent
                 |                   /        \
            No solution        Unique        More than
                              solution      one solution



In view of Theorem 2.1, the unique solution above can only occur when the associated 
homogeneous system has only the zero solution. 

Example 2.4: We reduce the following system by applying the operations L2 → -3L1 + 2L2 and
L3 → -3L1 + 2L3, and then the operation L3 → -3L2 + L3:

    2x + y - 2z + 3w = 1
    3x + 2y - z + 2w = 4
    3x + 3y + 3z - 3w = 5

to

    2x + y - 2z + 3w = 1
         y + 4z - 5w = 5
        3y + 12z - 15w = 7

to

    2x + y - 2z + 3w = 1
         y + 4z - 5w = 5
                   0 = -8

The equation 0 = -8, that is, 0x + 0y + 0z + 0w = -8, shows that the original
system is inconsistent, and so has no solution.



Example 2.5: We reduce the following system by applying the operations L2 → -L1 + L2,
L3 → -2L1 + L3 and L4 → -2L1 + L4, and then the operations L3 → L2 - L3
and L4 → -2L2 + L4:

    x + 2y - 3z = 4
    x + 3y + z = 11
    2x + 5y - 4z = 13
    2x + 6y + 2z = 22

to

    x + 2y - 3z = 4
        y + 4z = 7
        y + 2z = 5
        2y + 8z = 14

to

    x + 2y - 3z = 4
        y + 4z = 7
            2z = 2
             0 = 0

to

    x + 2y - 3z = 4
        y + 4z = 7
            2z = 2

Observe first that the system is consistent since there is no equation of the form
0 = b, with b ≠ 0. Furthermore, since in echelon form there are three equations
in the three unknowns, the system has a unique solution. By the third equation,
z = 1. Substituting z = 1 into the second equation, we obtain y = 3. Substitut-
ing z = 1 and y = 3 into the first equation, we find x = 1. Thus x = 1, y = 3
and z = 1 or, in other words, the 3-tuple (1, 3, 1) is the unique solution of the
system.



Example 2.6: We reduce the following system by applying the operations L2 → -2L1 + L2 and
L3 → -5L1 + L3, and then the operation L3 → -2L2 + L3:

    x + 2y - 2z + 3w = 2
    2x + 4y - 3z + 4w = 5
    5x + 10y - 8z + 11w = 12

to

    x + 2y - 2z + 3w = 2
              z - 2w = 1
             2z - 4w = 2

to

    x + 2y - 2z + 3w = 2
              z - 2w = 1



The system is consistent, and since there are more unknowns than equations in
echelon form, the system has an infinite number of solutions. In fact, there are
two free variables, y and w, and so a particular solution can be obtained by giving
y and w any values. For example, let w = 1 and y = -2. Substituting w = 1
into the second equation, we obtain z = 3. Putting w = 1, z = 3 and y = -2
into the first equation, we find x = 9. Thus x = 9, y = -2, z = 3 and w = 1 or,
in other words, the 4-tuple (9, -2, 3, 1) is a particular solution of the system.

Remark: We find the general solution of the system in the above example as follows.
Let the free variables be assigned arbitrary values; say, y = a and w = b.
Substituting w = b into the second equation, we obtain z = 1 + 2b. Putting
y = a, z = 1 + 2b and w = b into the first equation, we find x = 4 - 2a + b.
Thus the general solution of the system is

    x = 4 - 2a + b,   y = a,   z = 1 + 2b,   w = b

or, in other words, (4 - 2a + b, a, 1 + 2b, b), where a and b are arbitrary num-
bers. Frequently, the general solution is left in terms of the free variables y
and w (instead of a and b) as follows:

    x = 4 - 2y + w,   z = 1 + 2w        or        (4 - 2y + w, y, 1 + 2w, w)

We will investigate further the representation of the general solution of a
system of linear equations in a later chapter.
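
A parametric general solution such as the one above can be checked mechanically: substituting it back into every original equation must give an identity for all values of the parameters. The following Python lines are an illustration only (not part of the outline's problem sets) and use the system of Example 2.6:

    # General solution of Example 2.6: x = 4 - 2a + b, y = a, z = 1 + 2b, w = b
    import random

    def check(a, b):
        x, y, z, w = 4 - 2*a + b, a, 1 + 2*b, b
        eq1 = x + 2*y - 2*z + 3*w == 2
        eq2 = 2*x + 4*y - 3*z + 4*w == 5
        eq3 = 5*x + 10*y - 8*z + 11*w == 12
        return eq1 and eq2 and eq3

    assert all(check(random.randint(-5, 5), random.randint(-5, 5)) for _ in range(100))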

Example 2.7: Consider two equations in two unknowns:

    a_1 x + b_1 y = c_1
    a_2 x + b_2 y = c_2

According to our theory, exactly one of the following three cases must occur:

(i) The system is inconsistent.

(ii) The system is equivalent to two equations in echelon form.

(iii) The system is equivalent to one equation in echelon form.

Since linear equations in two unknowns with real coefficients can be represented
as lines in the plane R^2, the above cases can be interpreted geometrically as follows:

(i) The two lines are parallel.

(ii) The two lines intersect in a unique point.

(iii) The two lines are coincident.


SOLUTION OF A HOMOGENEOUS SYSTEM OF LINEAR EQUATIONS 

If we begin with a homogeneous system of linear equations, then the system is clearly
consistent since, for example, it has the zero solution 0 = (0, 0, ..., 0). Thus it can always
be reduced to an equivalent homogeneous system in echelon form:

    a_11 x_1 + a_12 x_2 + a_13 x_3 + ... + a_1n x_n = 0
    a_{2 j_2} x_{j_2} + a_{2, j_2+1} x_{j_2+1} + ... + a_2n x_n = 0
    ....................................................
    a_{r j_r} x_{j_r} + a_{r, j_r+1} x_{j_r+1} + ... + a_rn x_n = 0

Hence we have the two possibilities:

(i) r = n. Then the system has only the zero solution.
(ii) r < n. Then the system has a nonzero solution.

If we begin with fewer equations than unknowns then, in echelon form, r <n and 
hence the system has a nonzero solution. That is, 



24                                  LINEAR EQUATIONS                           [CHAP. 2



Theorem 2.3: A homogeneous system of linear equations with more unknowns than 
equations has a nonzero solution. 

Example 2.8: The homogeneous system

     x + 2y - 3z +  w = 0
     x - 3y +  z - 2w = 0
    2x +  y - 3z + 5w = 0

has a nonzero solution since there are four unknowns but only three equations.



Example 2.9: We reduce the following system to echelon form:

     x +  y -  z = 0          x + y -  z = 0          x + y -  z = 0
    2x - 3y +  z = 0    to      -5y + 3z = 0    to      -5y + 3z = 0
     x - 4y + 2z = 0            -5y + 3z = 0

The system has a nonzero solution, since we obtained only two equations in the
three unknowns in echelon form. For example, let z = 5; then y = 3 and x = 2.
In other words, the 3-tuple (2, 3, 5) is a particular nonzero solution.

Example 2.10: We reduce the following system to echelon form:

     x +  y -  z = 0          x + y - z = 0          x + y - z = 0
    2x + 4y -  z = 0    to       2y + z = 0    to       2y + z = 0
    3x + 2y + 2z = 0            -y + 5z = 0                11z = 0

Since in echelon form there are three equations in three unknowns, the system has
only the zero solution (0, 0, 0).



Solved Problems 



SOLUTION OF LINEAR EQUATIONS 

                       2x - 3y + 6z + 2v - 5w = 3
2.1. Solve the system:      y - 4z +  v       = 1 .
                                      v - 3w = 2

The system is in echelon form. Since the equations begin with the unknowns x, y and v re-
spectively, the other unknowns, z and w, are the free variables.

To find the general solution, let, say, z = a and w = b. Substituting into the third equation,

    v - 3b = 2        or        v = 2 + 3b

Substituting into the second equation,

    y - 4a + 2 + 3b = 1        or        y = 4a - 3b - 1

Substituting into the first equation,

    2x - 3(4a - 3b - 1) + 6a + 2(2 + 3b) - 5b = 3        or        x = 3a - 5b - 2

Thus the general solution of the system is

    x = 3a - 5b - 2,   y = 4a - 3b - 1,   z = a,   v = 2 + 3b,   w = b

or (3a - 5b - 2, 4a - 3b - 1, a, 2 + 3b, b), where a and b are arbitrary real numbers. Some texts
leave the general solution in terms of the free variables z and w instead of a and b as follows:

    x = 3z - 5w - 2
    y = 4z - 3w - 1        or        (3z - 5w - 2, 4z - 3w - 1, z, 2 + 3w, w)
    v = 2 + 3w

After finding the general solution, we can find a particular solution by substituting into the
general solution. For example, let a = 2 and b = 1; then

    x = -1,  y = 4,  z = 2,  v = 5,  w = 1        or        (-1, 4, 2, 5, 1)

is a particular solution of the given system.

                        x + 2y - 3z = -1
2.2. Solve the system:  3x -  y + 2z = 7 .
                        5x + 3y - 4z = 2

Reduce to echelon form. Eliminate x from the second and third equations by the operations
L2 -> -3L1 + L2 and L3 -> -5L1 + L3:

    -3L1: -3x - 6y + 9z = 3             -5L1: -5x - 10y + 15z = 5
     L2:   3x -  y + 2z = 7              L3:   5x +  3y -  4z = 2
    -----------------------------       ------------------------------
    -3L1 + L2:  -7y + 11z = 10          -5L1 + L3:  -7y + 11z = 7

Thus we obtain the equivalent system

    x + 2y - 3z = -1
       -7y + 11z = 10
       -7y + 11z = 7

The second and third equations show that the system is inconsistent, for if we subtract we obtain
0x + 0y + 0z = 3 or 0 = 3.

                        2x +  y - 2z = 10
2.3. Solve the system:  3x + 2y + 2z = 1 .
                        5x + 4y + 3z = 4

Reduce to echelon form. Eliminate x from the second and third equations by the operations
L2 -> -3L1 + 2L2 and L3 -> -5L1 + 2L3:

    -3L1: -6x - 3y + 6z = -30           -5L1: -10x - 5y + 10z = -50
    2L2:   6x + 4y + 4z = 2             2L3:   10x + 8y +  6z = 8
    ------------------------------      -------------------------------
    -3L1 + 2L2:  y + 10z = -28          -5L1 + 2L3:  3y + 16z = -42

Thus we obtain the following system from which we eliminate y from the third equation by the
operation L3 -> -3L2 + L3:

    2x + y - 2z = 10               2x + y - 2z = 10
        y + 10z = -28      to          y + 10z = -28
       3y + 16z = -42                     -14z = 42

In echelon form there are three equations in the three unknowns; hence the system has a unique
solution. By the third equation, z = -3. Substituting into the second equation, we find y = 2.
Substituting into the first equation, we obtain x = 1. Thus x = 1, y = 2 and z = -3, i.e. the 3-tuple
(1, 2, -3), is the unique solution of the system.



26 LINEAR EQUATIONS [CHAP. 2 

                        x + 2y - 3z = 6
2.4. Solve the system:  2x -  y + 4z = 2 .
                        4x + 3y - 2z = 14

Reduce the system to echelon form. Eliminate x from the second and third equations by the
operations L2 -> -2L1 + L2 and L3 -> -4L1 + L3:

    -2L1: -2x - 4y + 6z = -12           -4L1: -4x - 8y + 12z = -24
     L2:   2x -  y + 4z = 2              L3:   4x + 3y -  2z = 14
    ------------------------------      ------------------------------
              -5y + 10z = -10                     -5y + 10z = -10
    or          y -  2z = 2             or          y -  2z = 2

Thus the system is equivalent to

    x + 2y - 3z = 6                              x + 2y - 3z = 6
        y - 2z = 2        or simply                  y - 2z = 2
        y - 2z = 2

(Since the second and third equations are identical, we can disregard one of them.)

In echelon form there are only two equations in the three unknowns; hence the system has an
infinite number of solutions and, in particular, 3 - 2 = 1 free variable which is z.

To obtain the general solution let, say, z = a. Substitute into the second equation to obtain
y = 2 + 2a. Substitute into the first equation to obtain x + 2(2 + 2a) - 3a = 6 or x = 2 - a.
Thus the general solution is

    x = 2 - a,   y = 2 + 2a,   z = a        or        (2 - a, 2 + 2a, a)

where a is any real number.

The value, say, a = 1 yields the particular solution x = 1, y = 4, z = 1 or (1, 4, 1).

                        x - 3y + 4z - 2w = 5
2.5. Solve the system:      2y + 5z +  w = 2 .
                             y - 3z      = 4

The system is not in echelon form since, for example, y appears as the first unknown in both
the second and third equations. However, if we rewrite the system so that w is the second unknown,
then we obtain the following system which is in echelon form:

    x - 2w - 3y + 4z = 5
        w + 2y + 5z = 2
             y - 3z = 4

Now if a 4-tuple (a, b, c, d) is given as a solution, it is not clear if b should be substituted for
w or for y; hence for theoretical reasons we consider the two systems to be distinct. Of course this
does not prohibit us from using the new system to obtain the solution of the original system.

Let z = a. Substituting into the third equation, we find y = 4 + 3a. Substituting into the
second equation, we obtain w + 2(4 + 3a) + 5a = 2 or w = -6 - 11a. Substituting into the first
equation,

    x - 2(-6 - 11a) - 3(4 + 3a) + 4a = 5        or        x = 5 - 17a

Thus the general solution of the original system is

    x = 5 - 17a,   y = 4 + 3a,   z = a,   w = -6 - 11a

where a is any real number.



CHAP. 2]                            LINEAR EQUATIONS                                27

2.6. Determine the values of a so that the following system in unknowns x, y and z has:
(i) no solution, (ii) more than one solution, (iii) a unique solution:

     x +  y -  z = 1
    2x + 3y + az = 3
     x + ay + 3z = 2

Reduce the system to echelon form. Eliminate x from the second and third equations by the
operations L2 -> -2L1 + L2 and L3 -> -L1 + L3:

    -2L1: -2x - 2y + 2z = -2            -L1: -x -  y +  z = -1
     L2:   2x + 3y + az = 3              L3:  x + ay + 3z = 2
    -----------------------------       ----------------------------
          y + (a + 2)z = 1                   (a - 1)y + 4z = 1

Thus the equivalent system is

    x + y - z = 1
        y + (a + 2)z = 1
        (a - 1)y + 4z = 1

Now eliminate y from the third equation by the operation L3 -> -(a - 1)L2 + L3,

    -(a - 1)L2: -(a - 1)y + (2 - a - a^2)z = 1 - a
     L3:          (a - 1)y + 4z = 1
    -----------------------------------------------
                (6 - a - a^2)z = 2 - a
    or          (3 + a)(2 - a)z = 2 - a

to obtain the equivalent system

    x + y - z = 1
        y + (a + 2)z = 1
        (3 + a)(2 - a)z = 2 - a

which has a unique solution if the coefficient of z in the third equation is not zero, that is, if a ≠ 2
and a ≠ -3. In case a = 2, the third equation is 0 = 0 and the system has more than one solu-
tion. In case a = -3, the third equation is 0 = 5 and the system has no solution.

Summarizing, we have: (i) a = -3, (ii) a = 2, (iii) a ≠ 2 and a ≠ -3.



2.7. Which condition must be placed on a, b and c so that the following system in unknowns
x, y and z has a solution?

     x + 2y -  3z = a
    2x + 6y - 11z = b
     x - 2y +  7z = c

Reduce to echelon form. Eliminating x from the second and third equations by the operations
L2 -> -2L1 + L2 and L3 -> -L1 + L3, we obtain the equivalent system

    x + 2y - 3z = a
        2y - 5z = b - 2a
       -4y + 10z = c - a

Eliminating y from the third equation by the operation L3 -> 2L2 + L3, we finally obtain the
equivalent system

    x + 2y - 3z = a
        2y - 5z = b - 2a
              0 = c + 2b - 5a

The system will have no solution if the third equation is of the form 0 = k, with k ≠ 0; that is,
if c + 2b - 5a ≠ 0. Thus the system will have at least one solution if

    c + 2b - 5a = 0        or        5a = 2b + c

Note, in this case, that the system will have more than one solution. In other words, the system
cannot have a unique solution.

HOMOGENEOUS SYSTEMS OF LINEAR EQUATIONS 

2.8. Determine whether each system has a nonzero solution:

     x - 2y + 3z - 2w = 0         x + 2y - 3z = 0          x + 2y -  z = 0
    3x - 7y - 2z + 4w = 0        2x + 5y + 2z = 0         2x + 5y + 2z = 0
    4x + 3y + 5z + 2w = 0        3x -  y - 4z = 0          x + 4y + 7z = 0
                                                           x + 3y + 3z = 0
            (i)                        (ii)                      (iii)

(i) The system must have a nonzero solution since there are more unknowns than equations.

(ii) Reduce to echelon form:

     x + 2y - 3z = 0          x + 2y - 3z = 0          x + 2y - 3z = 0
    2x + 5y + 2z = 0    to        y + 8z = 0    to         y + 8z = 0
    3x -  y - 4z = 0            -7y + 5z = 0                  61z = 0

    In echelon form there are exactly three equations in the three unknowns; hence the system has
    a unique solution, the zero solution.

(iii) Reduce to echelon form:

     x + 2y -  z = 0          x + 2y - z = 0
    2x + 5y + 2z = 0              y + 4z = 0          x + 2y - z = 0
     x + 4y + 7z = 0    to       2y + 8z = 0    to        y + 4z = 0
     x + 3y + 3z = 0              y + 4z = 0

    In echelon form there are only two equations in the three unknowns; hence the system has a
    nonzero solution.

2.9. The vectors u_1, ..., u_m in, say, R^n are said to be linearly dependent, or simply
dependent, if there exist scalars k_1, ..., k_m, not all of them zero, such that
k_1 u_1 + ... + k_m u_m = 0. Otherwise they are said to be independent. Determine
whether the vectors u, v and w are dependent or independent where:

(i) u = (1, 1, -1), v = (2, -3, 1), w = (8, -7, 1)

(ii) u = (1, -2, -3), v = (2, 3, -1), w = (3, 2, 1)

(iii) u = (a_1, a_2), v = (b_1, b_2), w = (c_1, c_2)

In each case:
(a) let xu + yv + zw = 0 where x, y and z are unknown scalars;
(b) find the equivalent homogeneous system of equations;
(c) determine whether the system has a nonzero solution. If the system does, then the vectors are
    dependent; if the system does not, then they are independent.

(i) Let xu + yv + zw = 0:

        x(1, 1, -1) + y(2, -3, 1) + z(8, -7, 1) = (0, 0, 0)
    or  (x, x, -x) + (2y, -3y, y) + (8z, -7z, z) = (0, 0, 0)
    or  (x + 2y + 8z, x - 3y - 7z, -x + y + z) = (0, 0, 0)

    Set corresponding components equal to each other and reduce the system to echelon form:

     x + 2y + 8z = 0        x + 2y + 8z = 0        x + 2y + 8z = 0        x + 2y + 8z = 0
     x - 3y - 7z = 0   to      -5y - 15z = 0   to       y + 3z = 0   to       y + 3z = 0
    -x +  y +  z = 0            3y +  9z = 0            y + 3z = 0

    In echelon form there are only two equations in the three unknowns; hence the system has a
    nonzero solution. Accordingly, the vectors are dependent.

    Remark: We need not solve the system to determine dependence or independence; we only need
    to know if a nonzero solution exists.

(ii)    x(1, -2, -3) + y(2, 3, -1) + z(3, 2, 1) = (0, 0, 0)
        (x, -2x, -3x) + (2y, 3y, -y) + (3z, 2z, z) = (0, 0, 0)
        (x + 2y + 3z, -2x + 3y + 2z, -3x - y + z) = (0, 0, 0)

      x + 2y + 3z = 0          x + 2y + 3z = 0          x + 2y + 3z = 0
    -2x + 3y + 2z = 0    to        7y + 8z = 0    to        7y + 8z = 0
    -3x -  y +  z = 0             5y + 10z = 0                  30z = 0

    In echelon form there are exactly three equations in the three unknowns; hence the system has
    only the zero solution. Accordingly, the vectors are independent.

(iii)   x(a_1, a_2) + y(b_1, b_2) + z(c_1, c_2) = (0, 0)
        (a_1 x, a_2 x) + (b_1 y, b_2 y) + (c_1 z, c_2 z) = (0, 0)
        (a_1 x + b_1 y + c_1 z, a_2 x + b_2 y + c_2 z) = (0, 0)

    and so

        a_1 x + b_1 y + c_1 z = 0
        a_2 x + b_2 y + c_2 z = 0

    The system has a nonzero solution by Theorem 2.3, i.e. because there are more unknowns than
    equations; hence the vectors are dependent. In other words, we have proven that any three
    vectors in R^2 are dependent.
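
The test used in this problem is easy to mechanize: form the homogeneous system whose coefficient matrix has the given vectors as columns, row reduce, and compare the number of nonzero rows (the rank) with the number of vectors. The following Python sketch is an illustration only (the helper names rank and dependent are ours); exact fractions are used so that no rounding occurs.

    from fractions import Fraction

    def rank(rows):
        """Rank via row reduction to echelon form."""
        rows = [[Fraction(a) for a in r] for r in rows]
        m, n = len(rows), len(rows[0])
        r = 0
        for col in range(n):
            piv = next((i for i in range(r, m) if rows[i][col] != 0), None)
            if piv is None:
                continue
            rows[r], rows[piv] = rows[piv], rows[r]
            for i in range(r + 1, m):
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
            r += 1
        return r

    def dependent(vectors):
        # xu + yv + ... = 0 has a nonzero solution iff rank < number of vectors;
        # the coefficient matrix of that homogeneous system has the vectors as columns.
        cols = list(zip(*vectors))
        return rank([list(c) for c in cols]) < len(vectors)

    print(dependent([(1, 1, -1), (2, -3, 1), (8, -7, 1)]))   # True  (part (i))
    print(dependent([(1, -2, -3), (2, 3, -1), (3, 2, 1)]))   # False (part (ii))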



2.10. Suppose in a homogeneous system of linear equations the coefficients of one of the
unknowns are all zero. Show that the system has a nonzero solution.

Suppose x_1, ..., x_n are the unknowns of the system, and x_j is the unknown whose coefficients
are all zero. Then each equation of the system is of the form

    a_1 x_1 + ... + a_{j-1} x_{j-1} + 0 x_j + a_{j+1} x_{j+1} + ... + a_n x_n = 0

Then for example (0, ..., 0, 1, 0, ..., 0), where 1 is the jth component, is a nonzero solution of each
equation and hence of the system.



MISCELLANEOUS PROBLEMS 

2.11. Prove Theorem 2.1: Suppose u is a particular solution of the nonhomogeneous system
(*) and suppose W is the general solution of the associated homogeneous system (**).
Then

    u + W = {u + w : w ∈ W}

is the general solution of the nonhomogeneous system (*).

Let U denote the general solution of the nonhomogeneous system (*). Suppose u ∈ U and that
u = (u_1, ..., u_n). Since u is a solution of (*), we have for i = 1, ..., m,

    a_i1 u_1 + a_i2 u_2 + ... + a_in u_n = b_i

Now suppose w ∈ W and that w = (w_1, ..., w_n). Since w is a solution of the homogeneous system
(**), we have for i = 1, ..., m,

    a_i1 w_1 + a_i2 w_2 + ... + a_in w_n = 0

Therefore, for i = 1, ..., m,

    a_i1(u_1 + w_1) + a_i2(u_2 + w_2) + ... + a_in(u_n + w_n)
        = (a_i1 u_1 + a_i2 u_2 + ... + a_in u_n) + (a_i1 w_1 + a_i2 w_2 + ... + a_in w_n)
        = b_i + 0 = b_i

That is, u + w is a solution of (*). Thus u + w ∈ U, and hence

    u + W ⊆ U

Now suppose v = (v_1, ..., v_n) is any arbitrary element of U, i.e. a solution of (*). Then, for
i = 1, ..., m,

    a_i1 v_1 + a_i2 v_2 + ... + a_in v_n = b_i

Observe that v = u + (v - u). We claim that v - u ∈ W. For i = 1, ..., m,

    a_i1(v_1 - u_1) + a_i2(v_2 - u_2) + ... + a_in(v_n - u_n)
        = (a_i1 v_1 + a_i2 v_2 + ... + a_in v_n) - (a_i1 u_1 + a_i2 u_2 + ... + a_in u_n)
        = b_i - b_i = 0

Thus v - u is a solution of the homogeneous system (**), i.e. v - u ∈ W. Then v ∈ u + W, and hence

    U ⊆ u + W

Both inclusion relations give us U = u + W; that is, u + W is the general solution of the
nonhomogeneous system (*).

2.12. Consider the system (*) of linear equations (page 18). Multiplying the ith equation
by c_i, and adding, we obtain the equation

    (c_1 a_11 + ... + c_m a_m1)x_1 + ... + (c_1 a_1n + ... + c_m a_mn)x_n = c_1 b_1 + ... + c_m b_m        (1)

Such an equation is termed a linear combination of the equations in (*). Show that
any solution of (*) is also a solution of the linear combination (1).

Suppose u = (k_1, ..., k_n) is a solution of (*). Then

    a_i1 k_1 + a_i2 k_2 + ... + a_in k_n = b_i,   i = 1, ..., m        (2)

To show that u is a solution of (1), we must verify the equation

    (c_1 a_11 + ... + c_m a_m1)k_1 + ... + (c_1 a_1n + ... + c_m a_mn)k_n = c_1 b_1 + ... + c_m b_m

But this can be rearranged into

    c_1(a_11 k_1 + ... + a_1n k_n) + ... + c_m(a_m1 k_1 + ... + a_mn k_n) = c_1 b_1 + ... + c_m b_m

or, by (2),        c_1 b_1 + ... + c_m b_m = c_1 b_1 + ... + c_m b_m

which is clearly a true statement.

2.13. In the system (*) of linear equations, suppose a_11 ≠ 0. Let (#) be the system ob-
tained from (*) by the operation L_i -> -a_i1 L_1 + a_11 L_i, i ≠ 1. Show that (*) and (#)
are equivalent systems, i.e. have the same solution set.

In view of the above operation on (*), each equation in (#) is a linear combination of equations
in (*); hence by the preceding problem any solution of (*) is also a solution of (#).

On the other hand, applying the operation L_i -> (1/a_11)(a_i1 L_1 + L_i) to (#), we obtain the origi-
nal system (*). That is, each equation in (*) is a linear combination of equations in (#); hence each
solution of (#) is also a solution of (*).

Both conditions show that (*) and (#) have the same solution set.



CHAP. 2] LINEAR EQUATIONS 31 

2.14. Prove Theorem 2.2: Consider a system in echelon form:

    a_11 x_1 + a_12 x_2 + a_13 x_3 + ... + a_1n x_n = b_1
    a_{2 j_2} x_{j_2} + a_{2, j_2+1} x_{j_2+1} + ... + a_2n x_n = b_2
    ....................................................
    a_{r j_r} x_{j_r} + a_{r, j_r+1} x_{j_r+1} + ... + a_rn x_n = b_r

where 1 < j_2 < ... < j_r and where a_11 ≠ 0, a_{2 j_2} ≠ 0, ..., a_{r j_r} ≠ 0. The solution is as
follows. There are two cases:

(i) r = n. Then the system has a unique solution.

(ii) r < n. Then we can arbitrarily assign values to the n - r free variables and
obtain a solution of the system.

The proof is by induction on the number r of equations in the system. If r = 1, then we have
the single linear equation

    a_1 x_1 + a_2 x_2 + a_3 x_3 + ... + a_n x_n = b,   where a_1 ≠ 0

The free variables are x_2, ..., x_n. Let us arbitrarily assign values to the free variables; say,
x_2 = k_2, x_3 = k_3, ..., x_n = k_n. Substituting into the equation and solving for x_1,

    x_1 = (1/a_1)(b - a_2 k_2 - a_3 k_3 - ... - a_n k_n)

These values constitute a solution of the equation; for, on substituting, we obtain

    a_1 · (1/a_1)(b - a_2 k_2 - ... - a_n k_n) + a_2 k_2 + ... + a_n k_n = b        or        b = b

which is a true statement.

Furthermore if r = n = 1, then we have ax = b, where a ≠ 0. Note that x = b/a is a solu-
tion since a(b/a) = b is true. Moreover if x = k is a solution, i.e. ak = b, then k = b/a. Thus
the equation has a unique solution as claimed.

Now assume r > 1 and that the theorem is true for a system of r - 1 equations. We view the
r - 1 equations

    a_{2 j_2} x_{j_2} + a_{2, j_2+1} x_{j_2+1} + ... + a_2n x_n = b_2
    ....................................................
    a_{r j_r} x_{j_r} + a_{r, j_r+1} x_{j_r+1} + ... + a_rn x_n = b_r

as a system in the unknowns x_{j_2}, ..., x_n. Note that the system is in echelon form. By induction
we can arbitrarily assign values to the (n - j_2 + 1) - (r - 1) free variables in the reduced system
to obtain a solution (say, x_{j_2} = k_{j_2}, ..., x_n = k_n). As in case r = 1, these values and arbitrary
values for the additional j_2 - 2 free variables (say, x_2 = k_2, ..., x_{j_2 - 1} = k_{j_2 - 1}) yield a solution
of the first equation with

    x_1 = (1/a_11)(b_1 - a_12 k_2 - ... - a_1n k_n)

(Note that there are (n - j_2 + 1) - (r - 1) + (j_2 - 2) = n - r free variables.) Furthermore, these
values for x_1, ..., x_n also satisfy the other equations since, in these equations, the coefficients of
x_1, ..., x_{j_2 - 1} are zero.

Now if r = n, then j_2 = 2. Thus by induction we obtain a unique solution of the subsystem
and then a unique solution of the entire system. Accordingly, the theorem is proven.

2.15. A system (*) of linear equations is defined to be consistent if no linear combination
of its equations is the equation

    0 x_1 + 0 x_2 + ... + 0 x_n = b,   where b ≠ 0        (1)

Show that the system (*) is consistent if and only if it is reducible to echelon form.

Suppose (*) is reducible to echelon form. Then it has a solution which, by Problem 2.12, is a
solution of every linear combination of its equations. Since (1) has no solution, it cannot be a linear
combination of the equations in (*). That is, (*) is consistent.

On the other hand, suppose (*) is not reducible to echelon form. Then, in the reduction process,
it must yield an equation of the form (1). That is, (1) is a linear combination of the equations in (*).
Accordingly (*) is not consistent, i.e. (*) is inconsistent.



32                                  LINEAR EQUATIONS                           [CHAP. 2



Supplementary Problems 



SOLUTION OF LINEAR EQUATIONS 

2.16. Solve:

          2x + 3y = 1              2x + 4y = 10              4x - 2y = 5
    (i)   5x + 7y = 3       (ii)   3x + 6y = 15      (iii)  -6x + 3y = 1



2.17. Solve:

          2x +  y - 3z = 5           2x + 3y - 2z = 5            x + 2y +  3z = 3
    (i)   3x - 2y + 2z = 5     (ii)   x - 2y + 3z = 2     (iii) 2x + 3y +  8z = 4
          5x - 3y -  z = 16          4x -  y + 4z = 1           3x + 2y + 17z = 1


2.18. Solve:

          2x + 3y = 3            x + 2y - 3z + 2w = 2            x + 2y -  z + 3w = 3
    (i)    x - 2y = 5     (ii)  2x + 5y - 8z + 6w = 5     (iii)  2x + 4y + 4z + 3w = 9
          3x + 2y = 7           3x + 4y - 5z + 2w = 4            3x + 6y -  z + 8w = 10



2.19. Solve:

          x + 2y + 2z = 2
    (i)  3x - 2y -  z = 5               x + 5y + 4z - 13w = 3
         2x - 5y + 3z = -4      (ii)   3x -  y + 2z +  5w = 2
          x + 4y + 6z = 0              2x + 2y + 3z -  4w = 1



2.20. Determine the values of k such that the system in unknowns x, y and z has: (i) a unique solution,
(ii) no solution, (iii) more than one solution:

         kx +  y +  z = 1
    (a)   x + ky +  z = 1          (b)    x + 2y + kz = 1
          x +  y + kz = 1                2x + ky + 8z = 3


2.21. Determine the values of k such that the system in unknowns x, y and z has: (i) a unique solution,
(ii) no solution, (iii) more than one solution:

          x +  y + kz = 2              x      - 3z = -3
    (a)  3x + 4y + 2z = k       (b)   2x + ky -  z = -2
         2x + 3y -  z = 1              x + 2y + kz = 1

2.22. Determine the condition on a, b and c so that the system in unknowns x, y and z has a solution:

          x + 2y - 3z = a              x - 2y + 4z = a
    (i)  3x -  y + 2z = b      (ii)   2x + 3y -  z = b
          x - 5y + 8z = c             3x +  y + 2z = c



HOMOGENEOUS SYSTEMS 

2.23. Determine whether each system has a nonzero solution:

          x + 3y - 2z = 0          x + 3y - 2z = 0            x + 2y - 5z + 4w = 0
    (i)   x - 8y + 8z = 0   (ii)  2x - 3y +  z = 0    (iii)  2x - 3y + 2z + 3w = 0
         3x - 2y + 4z = 0         3x - 2y + 2z = 0           4x - 7y +  z - 6w = 0



CHAP. 2] LINEAR EQUATIONS 33 

2.24. Determine whether each system has a nonzero solution:

          x -  2y +  2z = 0          2x - 4y + 7z + 4v - 5w = 0
         2x +   y -  2z = 0          9x + 3y + 2z - 7v +  w = 0
    (i)  3x +  4y -  6z = 0    (ii)  5x + 2y - 3z +  v + 3w = 0
         3x - 11y + 12z = 0          6x - 5y + 4z - 3v - 2w = 0



2.25. Determine whether the vectors u, v and w are dependent or independent (see Problem 2.9) where: 
(i) u = (1, 3, -1), V = (2, 0, 1), w = (1, -1, 1) 

(ii) u = (1, 1, -1), V = (2, 1, 0), w = (-1, 1, 2) 

(iii) u = (1, -2, 3, 1), V = (3, 2, 1, -2), w = (1, 6, -5, -4) 

MISCELLANEOUS PROBLEMS 

2.26. Consider two general linear equations in two unknowns x and y over the real field R:

    ax + by = e
    cx + dy = f

Show that:

(i) if a/c ≠ b/d, i.e. if ad - bc ≠ 0, then the system has the unique solution
    x = (de - bf)/(ad - bc),  y = (af - ce)/(ad - bc);

(ii) if a/c = b/d ≠ e/f, then the system has no solution;

(iii) if a/c = b/d = e/f, then the system has more than one solution.

2.27. Consider the system

    ax + by = 1
    cx + dy = 0

Show that if ad - bc ≠ 0, then the system has the unique solution x = d/(ad - bc), y = -c/(ad - bc).
Also show that if ad - bc = 0, and c ≠ 0 or d ≠ 0, then the system has no solution.



2.28. Show that an equation of the form 0 x_1 + 0 x_2 + ... + 0 x_n = 0 may be added to or deleted from a
system without affecting the solution set.



2.29. Consider a system of linear equations with the same number of equations as unknowns:

    a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
    a_21 x_1 + a_22 x_2 + ... + a_2n x_n = b_2
    ..........................................        (1)
    a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n = b_n

(i) Suppose the associated homogeneous system has only the zero solution. Show that (1) has a
    unique solution for every choice of constants b_i.

(ii) Suppose the associated homogeneous system has a nonzero solution. Show that there are
     constants b_i for which (1) does not have a solution. Also show that if (1) has a solution, then
     it has more than one.



34                                  LINEAR EQUATIONS                           [CHAP. 2



Answers to Supplementary Problems

2.16. (i) x = 2, y = -1; (ii) x = 5 - 2a, y = a; (iii) no solution

2.17. (i) (1, -3, -2); (ii) no solution; (iii) (-1 - 7a, 2 + 2a, a) or x = -1 - 7z, y = 2 + 2z

2.18. (i) x = 3, y = -1
      (ii) (-a + 2b, 1 + 2a - 2b, a, b) or x = -z + 2w, y = 1 + 2z - 2w
      (iii) (7/2 - 5b/2 - 2a, a, 1/2 + b/2, b) or x = 7/2 - 5w/2 - 2y, z = 1/2 + w/2

2.19. (i) (2, 1, -1); (ii) no solution

2.20. (a) (i) k ≠ 1 and k ≠ -2; (ii) k = -2; (iii) k = 1
      (b) (i) never has a unique solution; (ii) k = 4; (iii) k ≠ 4

2.21. (a) (i) k ≠ 3; (ii) no value of k, the system always has a solution; (iii) k = 3
      (b) (i) k ≠ 2 and k ≠ -5; (ii) k = -5; (iii) k = 2

2.22. (i) 2a - b + c = 0. (ii) Any values for a, b and c yield a solution.

2.23. (i) yes; (ii) no; (iii) yes, by Theorem 2.3.

2.24. (i) yes; (ii) yes, by Theorem 2.3.

2.25. (i) dependent; (ii) independent; (iii) dependent



chapter 3 



Matrices 

INTRODUCTION 

In working with a system of linear equations, only the coefficients and their respective 
positions are important. Also, in reducing the system to echelon form, it is essential to 
keep the equations carefully aligned. Thus these coefficients can be efficiently arranged in 
a rectangular array called a "matrix". Moreover, certain abstract objects introduced in 
later chapters, such as "change of basis", "linear operator" and "bilinear form", can also 
be represented by these rectangular arrays, i.e. matrices. 

In this chapter, we will study these matrices and certain algebraic operations defined on 
them. The material introduced here is mainly computational. However, as with linear 
equations, the abstract treatment presented later on will give us new insight into the 
structure of these matrices. 

Unless otherwise stated, all the "entries" in our matrices shall come from some arbitrary, 
but fixed, field K. (See Appendix B.) The elements of K are called scalars. Nothing essen- 
tial is lost if the reader assumes that K is the real field R or the complex field C. 

Lastly, we remark that the elements of R" or C" are conveniently represented by "row 
vectors" or "column vectors", which are special cases of matrices. 



MATRICES 

Let K be an arbitrary field. A rectangular array of the form

    ( a_11  a_12  ...  a_1n )
    ( a_21  a_22  ...  a_2n )
    (  ...                  )
    ( a_m1  a_m2  ...  a_mn )

where the a_ij are scalars in K, is called a matrix over K, or simply a matrix if K is implicit.
The above matrix is also denoted by (a_ij), i = 1, ..., m, j = 1, ..., n, or simply by (a_ij).
The m horizontal n-tuples

    (a_11, a_12, ..., a_1n),  (a_21, a_22, ..., a_2n),  ...,  (a_m1, a_m2, ..., a_mn)

are the rows of the matrix, and the n vertical m-tuples

    ( a_11 )   ( a_12 )         ( a_1n )
    ( a_21 ),  ( a_22 ),  ...,  ( a_2n )
    (  ...  )  (  ...  )        (  ...  )
    ( a_m1 )   ( a_m2 )         ( a_mn )

are its columns. Note that the element a_ij, called the ij-entry or ij-component, appears in
the ith row and the jth column. A matrix with m rows and n columns is called an m by n
matrix, or m x n matrix; the pair of numbers (m, n) is called its size or shape.

35 





36 MATRICES [CHAP. 3 

Example 3.1: The following is a 2 x 3 matrix:   ( 1 -3  4 )
                                                ( 0  5 -2 )

Its rows are (1, -3, 4) and (0, 5, -2); its columns are ( 1 ), ( -3 ) and (  4 ).
                                                        ( 0 )  (  5 )     ( -2 )

Matrices will usually be denoted by capital letters A,B, . . ., and the elements of the 
field K by lower case letters a,b, . . . . Two matrices A and B are equal, written A = B, if 
they have the same shape and if corresponding elements are equal. Thus the equality of 
two mxn matrices is equivalent to a system of mn equalities, one for each pair of elements. 

Example 3.2: The statement

    ( x + y    2z + w )   =   ( 3  5 )
    ( x - y    z -  w )       ( 1  4 )

is equivalent to the following system of equations:

    x + y = 3,    x - y = 1,    2z + w = 5,    z - w = 4

The solution of the system is x = 2, y = 1, z = 3, w = -1.

Remark: A matrix with one row is also referred to as a row vector, and with one column 
as a column vector. In particular, an element in the field K can be viewed as 
a 1 X 1 matrix. 



MATRIX ADDITION AND SCALAR MULTIPLICATION 

Let A and B be two matrices with the same size, i.e. the same number of rows and of
columns, say, m x n matrices:

    A = ( a_11  a_12  ...  a_1n )          B = ( b_11  b_12  ...  b_1n )
        ( a_21  a_22  ...  a_2n )              ( b_21  b_22  ...  b_2n )
        (  ...                  )              (  ...                  )
        ( a_m1  a_m2  ...  a_mn )              ( b_m1  b_m2  ...  b_mn )

The sum of A and B, written A + B, is the matrix obtained by adding corresponding entries:

    A + B = ( a_11 + b_11   a_12 + b_12   ...   a_1n + b_1n )
            ( a_21 + b_21   a_22 + b_22   ...   a_2n + b_2n )
            (  ...                                          )
            ( a_m1 + b_m1   a_m2 + b_m2   ...   a_mn + b_mn )

The product of a scalar k by the matrix A, written k·A or simply kA, is the matrix obtained
by multiplying each entry of A by k:

    kA = ( k a_11  k a_12  ...  k a_1n )
         ( k a_21  k a_22  ...  k a_2n )
         (  ...                        )
         ( k a_m1  k a_m2  ...  k a_mn )

Observe that A + B and kA are also m x n matrices. We also define

    -A = -1·A        and        A - B = A + (-B)

The sum of matrices with different sizes is not defined.



CHAP. 3]                                MATRICES                                    37



Example 3.3: Let A = ( 1 -2  3 )  and  B = (  3  0  2 ). Then
                     ( 4  5 -6 )           ( -7  1  8 )

    A + B = ( 1 + 3    -2 + 0    3 + 2 )   =   (  4  -2   5 )
            ( 4 - 7     5 + 1   -6 + 8 )       ( -3   6   2 )

    3A = ( 3·1   3·(-2)   3·3    )   =   (  3  -6    9 )
         ( 3·4   3·5      3·(-6) )       ( 12  15  -18 )

    2A - 3B = ( 2  -4    6 )  +  ( -9   0   -6 )   =   ( -7  -4    0 )
              ( 8  10  -12 )     ( 21  -3  -24 )       ( 29   7  -36 )


Example 3.4: The m x n matrix whose entries are all zero,

    ( 0  0  ...  0 )
    ( 0  0  ...  0 )
    ( ...          )
    ( 0  0  ...  0 )

is called the zero matrix and will be denoted by 0. It is similar to the scalar 0 in
that, for any m x n matrix A = (a_ij), A + 0 = (a_ij + 0) = (a_ij) = A.

Basic properties of matrices under the operations of matrix addition and scalar multi- 
plication follow. 

Theorem 3.1: Let V be the set of all m x n matrices over a field K. Then for any matrices
A, B, C ∈ V and any scalars k_1, k_2 ∈ K,

    (i)   (A + B) + C = A + (B + C)          (v)    k_1(A + B) = k_1 A + k_1 B
    (ii)  A + 0 = A                          (vi)   (k_1 + k_2)A = k_1 A + k_2 A
    (iii) A + (-A) = 0                       (vii)  (k_1 k_2)A = k_1(k_2 A)
    (iv)  A + B = B + A                      (viii) 1·A = A and 0A = 0

Using (vi) and (viii) above, we also have that A + A = 2A, A + A + A = 3A, ... .

Remark: Suppose vectors in R^n are represented by row vectors (or by column vectors);
say,

    u = (a_1, a_2, ..., a_n)   and   v = (b_1, b_2, ..., b_n)

Then viewed as matrices, the sum u + v and the scalar product ku are as follows:

    u + v = (a_1 + b_1, a_2 + b_2, ..., a_n + b_n)   and   ku = (k a_1, k a_2, ..., k a_n)

But this corresponds precisely to the sum and scalar product as defined in
Chapter 1. In other words, the above operations on matrices may be viewed
as a generalization of the corresponding operations defined in Chapter 1.
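
Matrix addition and scalar multiplication are entrywise, so they are immediate to program. The following Python sketch is an illustration only (the helper names are ours); it reproduces the computations of Example 3.3 using plain lists of lists.

    def mat_add(A, B):
        """Entrywise sum of two matrices of the same shape."""
        return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

    def scal_mul(k, A):
        """Multiply every entry of A by the scalar k."""
        return [[k * a for a in row] for row in A]

    A = [[1, -2, 3], [4, 5, -6]]
    B = [[3, 0, 2], [-7, 1, 8]]
    print(mat_add(A, B))                              # [[4, -2, 5], [-3, 6, 2]]
    print(mat_add(scal_mul(2, A), scal_mul(-3, B)))   # 2A - 3B = [[-7, -4, 0], [29, 7, -36]]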



MATRIX MULTIPLICATION 

The product of matrices A and B, written AB, is somewhat complicated. For this 
reason, we include the following introductory remarks. 

(i) Let A = (a_i) and B = (b_i) belong to R^n, with A represented by a row vector and B by a
column vector. Then their dot product A·B may be found by combining the matrices
as follows:

    A·B = (a_1, a_2, ..., a_n) ( b_1 )  =  a_1 b_1 + a_2 b_2 + ... + a_n b_n
                               ( b_2 )
                               ( ... )
                               ( b_n )

Accordingly, we define the matrix product of a row vector A by a column vector B as
above.

(ii) Consider the equations

    b_11 x_1 + b_12 x_2 + b_13 x_3 = y_1
    b_21 x_1 + b_22 x_2 + b_23 x_3 = y_2        (1)

This system is equivalent to the matrix equation

    ( b_11  b_12  b_13 ) ( x_1 )   =   ( y_1 )        or simply  BX = Y
    ( b_21  b_22  b_23 ) ( x_2 )       ( y_2 )
                         ( x_3 )

where B = (b_ij), X = (x_i) and Y = (y_i), if we combine the matrix B and the column
vector X as follows:

    BX = ( b_11 x_1 + b_12 x_2 + b_13 x_3 )   =   ( B_1·X )
         ( b_21 x_1 + b_22 x_2 + b_23 x_3 )       ( B_2·X )

where B_1 and B_2 are the rows of B. Note that the product of a matrix and a column
vector yields another column vector.

(iii) Now consider the equations

    a_11 y_1 + a_12 y_2 = z_1
    a_21 y_1 + a_22 y_2 = z_2        (2)

which we can represent, as above, by the matrix equation

    ( a_11  a_12 ) ( y_1 )   =   ( z_1 )        or simply  AY = Z
    ( a_21  a_22 ) ( y_2 )       ( z_2 )

where A = (a_ij), Y = (y_i) as above, and Z = (z_i). Substituting the values of y_1 and y_2
of (1) into the equations of (2), we obtain

    a_11(b_11 x_1 + b_12 x_2 + b_13 x_3) + a_12(b_21 x_1 + b_22 x_2 + b_23 x_3) = z_1
    a_21(b_11 x_1 + b_12 x_2 + b_13 x_3) + a_22(b_21 x_1 + b_22 x_2 + b_23 x_3) = z_2

or, on rearranging terms,

    (a_11 b_11 + a_12 b_21)x_1 + (a_11 b_12 + a_12 b_22)x_2 + (a_11 b_13 + a_12 b_23)x_3 = z_1
    (a_21 b_11 + a_22 b_21)x_1 + (a_21 b_12 + a_22 b_22)x_2 + (a_21 b_13 + a_22 b_23)x_3 = z_2        (3)

On the other hand, using the matrix equation BX = Y and substituting for Y into
AY = Z, we obtain the expression

    ABX = Z

This will represent the system (3) if we define the product of A and B as follows:

    ( a_11  a_12 ) ( b_11  b_12  b_13 )  =  ( a_11 b_11 + a_12 b_21   a_11 b_12 + a_12 b_22   a_11 b_13 + a_12 b_23 )
    ( a_21  a_22 ) ( b_21  b_22  b_23 )     ( a_21 b_11 + a_22 b_21   a_21 b_12 + a_22 b_22   a_21 b_13 + a_22 b_23 )

                                         =  ( A_1·B^1   A_1·B^2   A_1·B^3 )
                                            ( A_2·B^1   A_2·B^2   A_2·B^3 )

where A_1 and A_2 are the rows of A and B^1, B^2 and B^3 are the columns of B. We em-
phasize that if these computations are done in general, then the main requirement is
that the number of y_i in (1) and (2) must be the same. This will then correspond to the
fact that the number of columns of the matrix A must equal the number of rows of
the matrix B.



CHAP. 3]                                MATRICES                                    39



With the above introduction, we now formally define matrix multiplication. 

Definition: Suppose A = (a_ik) and B = (b_kj) are matrices such that the number of columns
of A is equal to the number of rows of B; say, A is an m x p matrix and B is a
p x n matrix. Then the product AB is the m x n matrix whose ij-entry is
obtained by multiplying the ith row A_i of A by the jth column B^j of B:

    AB = ( A_1·B^1   A_1·B^2   ...   A_1·B^n )
         ( A_2·B^1   A_2·B^2   ...   A_2·B^n )
         (  ...                              )
         ( A_m·B^1   A_m·B^2   ...   A_m·B^n )

That is, AB = (c_ij) where

    c_ij = a_i1 b_1j + a_i2 b_2j + ... + a_ip b_pj = Σ (k = 1 to p) a_ik b_kj

We emphasize that the product AB is not defined if A is an m x p matrix and B is a
q x n matrix, where p ≠ q.
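
The formula c_ij = Σ_k a_ik b_kj is itself an algorithm. The Python sketch below is an illustration only (the helper name is ours); it implements the formula directly and reproduces the products of Example 3.6 below, showing again that AB and BA need not be equal.

    def mat_mul(A, B):
        """Product of an m x p matrix A and a p x n matrix B: c_ij = sum_k a_ik * b_kj."""
        p = len(B)
        assert all(len(row) == p for row in A), "columns of A must equal rows of B"
        n = len(B[0])
        return [[sum(row[k] * B[k][j] for k in range(p)) for j in range(n)] for row in A]

    A = [[1, 2], [3, 4]]
    B = [[1, 1], [0, 2]]
    print(mat_mul(A, B))   # [[1, 5], [3, 11]]
    print(mat_mul(B, A))   # [[4, 6], [6, 8]]  -- AB != BA in general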



Example 3.5:

    ( r  s ) ( a_1  a_2  a_3 )   =   ( r a_1 + s b_1   r a_2 + s b_2   r a_3 + s b_3 )
    ( t  u ) ( b_1  b_2  b_3 )       ( t a_1 + u b_1   t a_2 + u b_2   t a_3 + u b_3 )



Example 3.6:

    ( 1  2 ) ( 1  1 )   =   ( 1·1 + 2·0   1·1 + 2·2 )   =   ( 1   5 )
    ( 3  4 ) ( 0  2 )       ( 3·1 + 4·0   3·1 + 4·2 )       ( 3  11 )

    ( 1  1 ) ( 1  2 )   =   ( 1·1 + 1·3   1·2 + 1·4 )   =   ( 4  6 )
    ( 0  2 ) ( 3  4 )       ( 0·1 + 2·3   0·2 + 2·4 )       ( 6  8 )


The above example shows that matrix multiplication is not commutative, i.e. the products 
AB and BA of matrices need not be equal. 

Matrix multiplication does, however, satisfy the following properties: 

Theorem 3.2: (i) (AB)C = A(BC), (associative law)

(ii) A(B + C) = AB + AC, (left distributive law)
(iii) (B + C)A = BA + CA, (right distributive law)
(iv) k(AB) = (kA)B = A(kB), where k is a scalar

We assume that the sums and products in the above theorem are defined.

We remark that 0A = 0 and B0 = 0 where 0 is the zero matrix.



TRANSPOSE 

The transpose of a matrix A, written A^t, is the matrix obtained by writing the rows of
A, in order, as columns:

    ( a_11  a_12  ...  a_1n ) t      ( a_11  a_21  ...  a_m1 )
    ( a_21  a_22  ...  a_2n )     =  ( a_12  a_22  ...  a_m2 )
    (  ...                  )        (  ...                  )
    ( a_m1  a_m2  ...  a_mn )        ( a_1n  a_2n  ...  a_mn )

Observe that if A is an m x n matrix, then A^t is an n x m matrix.



40 MATRICES [CHAP. 3 

Example 3.7:   ( 1  2  3 ) t   =   ( 1  4 )
               ( 4 -5 -6 )         ( 2 -5 )
                                   ( 3 -6 )

The transpose operation on matrices satisfies the following properties:

Theorem 3.3: (i) (A + B)^t = A^t + B^t
(ii) (A^t)^t = A
(iii) (kA)^t = k A^t, for k a scalar
(iv) (AB)^t = B^t A^t
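
Property (iv) of Theorem 3.3 is easy to check experimentally. The Python lines below are an illustration only (helper names are ours); they form A^t by turning rows into columns and verify (AB)^t = B^t A^t on a small example.

    def transpose(A):
        """Rows of A become the columns of A^t."""
        return [list(col) for col in zip(*A)]

    def mat_mul(A, B):
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

    A = [[1, 3], [2, -1]]
    B = [[2, 0, -4], [3, -2, 6]]
    print(transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A)))   # True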

MATRICES AND SYSTEMS OF LINEAR EQUATIONS 

The following system of linear equations

    a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
    a_21 x_1 + a_22 x_2 + ... + a_2n x_n = b_2
    ..........................................        (1)
    a_m1 x_1 + a_m2 x_2 + ... + a_mn x_n = b_m

is equivalent to the matrix equation

    ( a_11  a_12  ...  a_1n ) ( x_1 )   =   ( b_1 )        or simply  AX = B        (2)
    ( a_21  a_22  ...  a_2n ) ( x_2 )       ( b_2 )
    (  ...                  ) ( ... )       ( ... )
    ( a_m1  a_m2  ...  a_mn ) ( x_n )       ( b_m )

where A = (a_ij), X = (x_i) and B = (b_i). That is, every solution of the system (1) is a
solution of the matrix equation (2), and vice versa. Observe that the associated homogeneous
system of (1) is then equivalent to the matrix equation AX = 0.

The above matrix A is called the coefficient matrix of the system (1), and the matrix

    ( a_11  a_12  ...  a_1n  b_1 )
    ( a_21  a_22  ...  a_2n  b_2 )
    (  ...                       )
    ( a_m1  a_m2  ...  a_mn  b_m )

is called the augmented matrix of (1). Observe that the system (1) is completely determined
by its augmented matrix.

Example 3.8: The coefficient matrix and the augmented matrix of the system

    2x + 3y - 4z = 7
     x - 2y - 5z = 3

are respectively the following matrices:

    ( 2  3 -4 )        and        ( 2  3 -4  7 )
    ( 1 -2 -5 )                   ( 1 -2 -5  3 )

Observe that the system is equivalent to the matrix equation

    ( 2  3 -4 ) ( x )   =   ( 7 )
    ( 1 -2 -5 ) ( y )       ( 3 )
                ( z )

In studying linear equations it is usually simpler to use the language and theory of 
matrices, as indicated by the following theorems. 



CHAP. 3] MATRICES 41 

Theorem 3.4: Suppose u_1, u_2, ..., u_n are solutions of a homogeneous system of linear
equations AX = 0. Then every linear combination of the u_i of the form
k_1 u_1 + k_2 u_2 + ... + k_n u_n, where the k_i are scalars, is also a solution of
AX = 0. Thus, in particular, every multiple ku of any solution u of
AX = 0 is also a solution of AX = 0.

Proof. We are given that Au_1 = 0, Au_2 = 0, ..., Au_n = 0. Hence

    A(k_1 u_1 + k_2 u_2 + ... + k_n u_n) = k_1 Au_1 + k_2 Au_2 + ... + k_n Au_n
                                         = k_1·0 + k_2·0 + ... + k_n·0 = 0

Accordingly, k_1 u_1 + ... + k_n u_n is a solution of the homogeneous system AX = 0.

Theorem 3.5: Suppose the field K is infinite (e.g. if K is the real field R or the complex
field C). Then the system AX = B has no solution, a unique solution or
an infinite number of solutions.

Proof. It suffices to show that if AX = B has more than one solution, then it has
infinitely many. Suppose u and v are distinct solutions of AX = B; that is, Au = B and
Av = B. Then, for any k ∈ K,

    A(u + k(u - v)) = Au + k(Au - Av) = B + k(B - B) = B

In other words, for each k ∈ K, u + k(u - v) is a solution of AX = B. Since all such solu-
tions are distinct (Problem 3.31), AX = B has an infinite number of solutions as
claimed.

ECHELON MATRICES 

A matrix A = (a_ij) is an echelon matrix, or is said to be in echelon form, if the number
of zeros preceding the first nonzero entry of a row increases row by row until only zero
rows remain; that is, if there exist nonzero entries

    a_{1 j_1}, a_{2 j_2}, ..., a_{r j_r},   where j_1 < j_2 < ... < j_r

with the property that

    a_ij = 0   for i ≤ r, j < j_i,   and for i > r

We call a_{1 j_1}, ..., a_{r j_r} the distinguished elements of the echelon matrix A.

Example 3.9: The following are echelon matrices where the distinguished elements have been 
circled: 

/(i) 3 2 4 5 -6\ 

1-3 2 

2 

0/ 

In particular, an echelon matrix is called a row reduced echelon matrix if the dis- 
tinguished elements are: 

(i) the only nonzero entries in their respective columns; 

(ii) each equal to 1. 
The third matrix above is an example of a row reduced echelon matrix, the other two are 
not. Note that the zero matrix 0, for any number of rows or of columns, is also a row 
reduced echelon matrix. 

ROW EQUIVALENCE AND ELEMENTARY ROW OPERATIONS 

A matrix A is said to be row equivalent to a matrix B if B can be obtained from A by a 
finite sequence of the following operations called elementary row operations: 





42 MATRICES [CHAP. 3 

[E_1]: Interchange the ith row and the jth row: R_i <-> R_j.

[E_2]: Multiply the ith row by a nonzero scalar k: R_i -> kR_i, k ≠ 0.

[E_3]: Replace the ith row by k times the jth row plus the ith row: R_i -> kR_j + R_i.

In actual practice we apply [E_2] and then [E_3] in one step, i.e. the operation
[E]: Replace the ith row by k' times the jth row plus k (nonzero) times the ith row:
R_i -> k'R_j + kR_i, k ≠ 0.

The reader no doubt recognizes the similarity of the above operations and those used 
in solving systems of linear equations. In fact, two systems with row equivalent aug- 
mented matrices have the same solution set (Problem 3.71). The following algorithm is 
also similar to the one used with linear equations (page 20). 

Algorithm which row reduces a matrix to echelon form: 

Step 1. Suppose the j_1 column is the first column with a nonzero entry. Inter-
change the rows so that this nonzero entry appears in the first row, that is,
so that a_{1 j_1} ≠ 0.

Step 2. For each i > 1, apply the operation

    R_i -> -a_{i j_1} R_1 + a_{1 j_1} R_i

Repeat Steps 1 and 2 with the submatrix formed by all the rows excluding the first. 
Continue the process until the matrix is in echelon form. 

Remark: The term row reduce shall mean to transform by elementary row operations. 

Example 3.10: The following matrix A is row reduced to echelon form by applying the operations
R2 -> -2R1 + R2 and R3 -> -3R1 + R3, and then the operation R3 -> -5R2 + 4R3:

    A = ( 1  2  -3  0 )        ( 1  2  -3  0 )        ( 1  2  -3  0 )
        ( 2  4  -2  2 )   to   ( 0  0   4  2 )   to   ( 0  0   4  2 )
        ( 3  6  -4  3 )        ( 0  0   5  3 )        ( 0  0   0  2 )
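
The above algorithm can be stated as a short program. The Python sketch below is an illustration only (the function name to_echelon is ours); it uses the operation R_i -> -a_{i j_1} R_1 + a_{1 j_1} R_i of Step 2, which keeps integer entries integral, and reproduces the reduction of Example 3.10.

    def to_echelon(A):
        """Row reduce A to echelon form using R_i -> -a_{i,col} * R_pivot + a_{pivot,col} * R_i."""
        A = [row[:] for row in A]
        m, n = len(A), len(A[0])
        r = 0
        for col in range(n):
            piv = next((i for i in range(r, m) if A[i][col] != 0), None)
            if piv is None:
                continue
            A[r], A[piv] = A[piv], A[r]                 # Step 1: bring a nonzero entry up
            for i in range(r + 1, m):                   # Step 2: clear the column below it
                A[i] = [-A[i][col] * a1 + A[r][col] * ai
                        for a1, ai in zip(A[r], A[i])]
            r += 1
        return A

    A = [[1, 2, -3, 0], [2, 4, -2, 2], [3, 6, -4, 3]]
    print(to_echelon(A))   # [[1, 2, -3, 0], [0, 0, 4, 2], [0, 0, 0, 2]]  (cf. Example 3.10)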



Now suppose A = (a_ij) is a matrix in echelon form with distinguished elements
a_{1 j_1}, ..., a_{r j_r}. Apply the operations

    R_k -> -a_{k j_i} R_i + a_{i j_i} R_k,   k = 1, ..., i - 1

for i = 2, then i = 3, ..., i = r. Thus A is replaced by an echelon matrix whose dis-
tinguished elements are the only nonzero entries in their respective columns. Next, multiply
R_i by a_{i j_i}^{-1}, i = 1, ..., r. Thus, in addition, the distinguished elements are each 1. In other words,
the above process row reduces an echelon matrix to one in row reduced echelon form.

Example 3.11: On the following echelon matrix A, apply the operation R1 -> -4R2 + 3R1 and then
the operations R1 -> R3 + R1 and R2 -> -5R3 + 2R2:

    A = ( 2  3  4  5  6 )        ( 6  9  0  7  -2 )        ( 6  9  0  7  0 )
        ( 0  0  3  2  5 )   to   ( 0  0  3  2   5 )   to   ( 0  0  6  4  0 )
        ( 0  0  0  0  2 )        ( 0  0  0  0   2 )        ( 0  0  0  0  2 )

Next multiply R1 by 1/6, R2 by 1/6 and R3 by 1/2 to obtain the row reduced echelon
matrix

    ( 1  3/2  0  7/6  0 )
    ( 0   0   1  2/3  0 )
    ( 0   0   0   0   1 )

The above remarks show that any arbitrary matrix A is row equivalent to at least one 
row reduced echelon matrix. In the next chapter we prove, Theorem 4.8, that A is row 
equivalent to only one such matrix; we call it the row canonical form of A. 



CHAP. 3] MATRICES 43 

SQUARE MATRICES 

A matrix with the same number of rows as columns is called a square matrix. A square 
matrix with n rows and n columns is said to be of order n, and is called an n-square matrix. 
The diagonal (or: main diagonal) of the n-square matrix A = (a_ij) consists of the elements
a_11, a_22, ..., a_nn.

Example 3.12: The following is a 3-square matrix:

    ( 1  2  3 )
    ( 4  5  6 )
    ( 7  8  9 )

Its diagonal elements are 1, 5 and 9.

An upper triangular matrix or simply a triangular matrix is a square matrix whose
entries below the main diagonal are all zero:

    ( a_11  a_12  ...  a_1n )                ( a_11  a_12  ...  a_1n )
    (  0    a_22  ...  a_2n )       or       (       a_22  ...  a_2n )
    (  ...                  )                (        ...           )
    (  0     0    ...  a_nn )                (                 a_nn )

Similarly, a lower triangular matrix is a square matrix whose entries above the main
diagonal are all zero.

A diagonal matrix is a square matrix whose non-diagonal entries are all zero:

    ( a_1   0   ...   0  )                ( a_1              )
    (  0   a_2  ...   0  )       or       (      a_2         )
    (  ...               )                (           ...    )
    (  0    0   ...  a_n )                (              a_n )

In particular, the n-square matrix with 1's on the diagonal and 0's elsewhere, denoted by I_n
or simply I, is called the unit or identity matrix; e.g.,

    I_3 = ( 1  0  0 )
          ( 0  1  0 )
          ( 0  0  1 )

This matrix I is similar to the scalar 1 in that, for any n-square matrix A,

    AI = IA = A

The matrix kI, for a scalar k ∈ K, is called a scalar matrix; it is a diagonal matrix whose
diagonal entries are each k.

ALGEBRA OF SQUARE MATRICES 

Recall that not every two matrices can be added or multiplied. However, if we only 
consider square matrices of some given order n, then this inconvenience disappears. Specif- 
ically, the operations of addition, multiplication, scalar multiplication, and transpose can be 
performed on any nxn matrices and the result is again an n x n matrix. 

In particular, if A is any n-square matrix, we can form powers of A:

    A^2 = AA,   A^3 = A^2 A,   ...   and   A^0 = I

We can also form polynomials in the matrix A: for any polynomial

    f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n



44 MATRICES [CHAP. 3 

where the a_i are scalars, we define f(A) to be the matrix

    f(A) = a_0 I + a_1 A + a_2 A^2 + ... + a_n A^n

In the case that f(A) is the zero matrix, then A is called a zero or root of the polynomial f(x).

Example 3.13: Let A = ( 1  2 ); then A^2 = ( 1  2 )( 1  2 ) = (  7  -6 ).
                      ( 3 -4 )             ( 3 -4 )( 3 -4 )   ( -9  22 )

    If f(x) = 2x^2 - 3x + 5, then

        f(A) = 2A^2 - 3A + 5I = ( 14 -12 ) + ( -3  -6 ) + ( 5  0 ) = (  16 -18 )
                                (-18  44 )   ( -9  12 )   ( 0  5 )   ( -27  61 )

    If g(x) = x^2 + 3x - 10, then

        g(A) = A^2 + 3A - 10I = (  7  -6 ) + ( 3   6 ) + ( -10   0 ) = ( 0  0 )
                                ( -9  22 )   ( 9 -12 )   (   0 -10 )   ( 0  0 )

    Thus A is a zero of the polynomial g(x).
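
Evaluating a polynomial at a square matrix only requires the matrix operations already defined. The Python sketch below is an illustration only (helper names are ours); it computes f(A) = a_0 I + a_1 A + ... from the coefficient list of f and confirms that the matrix A of Example 3.13 is a zero of g(x) = x^2 + 3x - 10.

    def mat_mul(A, B):
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

    def mat_add(A, B):
        return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

    def poly_at(coeffs, A):
        """Evaluate f(A) = a0*I + a1*A + a2*A^2 + ... for f given by its coefficients [a0, a1, ...]."""
        n = len(A)
        power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]   # A^0 = I
        result = [[0] * n for _ in range(n)]
        for a in coeffs:
            result = mat_add(result, [[a * x for x in row] for row in power])
            power = mat_mul(power, A)
        return result

    A = [[1, 2], [3, -4]]
    print(poly_at([-10, 3, 1], A))   # g(A) for g(x) = x^2 + 3x - 10: [[0, 0], [0, 0]]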



INVERTIBLE MATRICES 

A square matrix A is said to be invertible if there exists a matrix B with the property 

that 

AB = BA = I 

where I is the identity matrix. Such a matrix B is unique; for

    AB_1 = B_1 A = I and AB_2 = B_2 A = I implies B_1 = B_1 I = B_1(AB_2) = (B_1 A)B_2 = I B_2 = B_2

We call such a matrix B the inverse of A and denote it by A^{-1}. Observe that the above
relation is symmetric; that is, if B is the inverse of A, then A is the inverse of B.



Example 3.14:

    ( 2  5 ) (  3 -5 )   =   ( 6 - 5   -10 + 10 )   =   ( 1  0 )
    ( 1  3 ) ( -1  2 )       ( 3 - 3    -5 +  6 )       ( 0  1 )

    (  3 -5 ) ( 2  5 )   =   (  6 - 5    15 - 15 )   =   ( 1  0 )
    ( -1  2 ) ( 1  3 )       ( -2 + 2    -5 +  6 )       ( 0  1 )

Thus ( 2  5 ) and (  3 -5 ) are invertible and are inverses of each other.
     ( 1  3 )     ( -1  2 )
We show (Problem 3.37) that for square matrices, AB = I if and only if BA = I; hence
it is necessary to test only one product to determine whether two given matrices are in-
verses, as in the next example.

Example 3.15:

    ( 1  0  2 ) ( -11  2  2 )   =   ( -11 + 0 + 12    2 + 0 - 2    2 + 0 - 2 )   =   ( 1  0  0 )
    ( 2 -1  3 ) (  -4  0  1 )       ( -22 + 4 + 18    4 + 0 - 3    4 - 1 - 3 )       ( 0  1  0 )
    ( 4  1  8 ) (   6 -1 -1 )       ( -44 - 4 + 48    8 + 0 - 8    8 + 1 - 8 )       ( 0  0  1 )

Thus the two matrices are invertible and are inverses of each other.

We now calculate the inverse of a general 2 x 2 matrix A = ( a  b ). We seek scalars
                                                           ( c  d )
x, y, z, w such that

    ( a  b ) ( x  y )   =   ( 1  0 )        or        ( ax + bz   ay + bw )   =   ( 1  0 )
    ( c  d ) ( z  w )       ( 0  1 )                  ( cx + dz   cy + dw )       ( 0  1 )

which reduces to solving the following two systems of linear equations in two unknowns:

    ax + bz = 1            ay + bw = 0
    cx + dz = 0            cy + dw = 1

If we let |A| = ad - bc, then by Problem 2.27, page 33, the above systems have solutions if
and only if |A| ≠ 0; such solutions are unique and are as follows:

    x = d/(ad - bc) = d/|A|,    y = -b/(ad - bc) = -b/|A|,    z = -c/(ad - bc) = -c/|A|,    w = a/(ad - bc) = a/|A|

Accordingly,

    A^{-1} = (  d/|A|   -b/|A| )   =   (1/|A|) (  d  -b )
             ( -c/|A|    a/|A| )               ( -c   a )

Remark: The reader no doubt recognizes |A| = ad - bc as the determinant of the matrix
A; thus we see that a 2 x 2 matrix has an inverse if and only if its determinant
is not zero. This relationship, which holds true in general, will be further
investigated in Chapter 9 on determinants.
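
The 2 x 2 inverse formula can be applied directly. A minimal Python sketch (illustrative only; the function name is ours), applied to the matrix of Example 3.14:

    def inverse_2x2(A):
        """Inverse of a 2 x 2 matrix via A^{-1} = (1/|A|) [[d, -b], [-c, a]]; needs |A| = ad - bc != 0."""
        (a, b), (c, d) = A
        det = a * d - b * c
        if det == 0:
            raise ValueError("matrix is not invertible")
        return [[d / det, -b / det], [-c / det, a / det]]

    print(inverse_2x2([[2, 5], [1, 3]]))   # [[3.0, -5.0], [-1.0, 2.0]]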



BLOCK MATRICES 

Using a system of horizontal and vertical lines, we can partition a matrix A into smaller
matrices called blocks (or: cells) of A. The matrix A is then called a block matrix. Clearly,
a given matrix may be divided into blocks in different ways; for example,



    ( 1 -2 | 0  1  3 )        ( 1 -2  0 | 1  3 )        ( 1 | -2  0 | 1  3 )
    ( 2  3 | 5  7 -2 )   =    ( 2  3  5 | 7 -2 )   =    ( 2 |  3  5 | 7 -2 )
    (-------+---------)       ( 3  1  4 | 5  9 )        (---+-------+------)
    ( 3  1 | 4  5  9 )                                  ( 3 |  1  4 | 5  9 )

The convenience of the partition into blocks is that the result of operations on block matrices 
can be obtained by carrying out the computation with the blocks, just as if they were the 
actual elements of the matrices. This is illustrated below. 



Suppose A is partitioned into blocks; say

    A = ( A_11  A_12  ...  A_1n )
        ( A_21  A_22  ...  A_2n )
        (  ...                  )
        ( A_m1  A_m2  ...  A_mn )

Multiplying each block by a scalar k multiplies each element of A by k; thus

    kA = ( kA_11  kA_12  ...  kA_1n )
         ( kA_21  kA_22  ...  kA_2n )
         (  ...                     )
         ( kA_m1  kA_m2  ...  kA_mn )

Now suppose a matrix B is partitioned into the same number of blocks as A; say

    B = ( B_11  B_12  ...  B_1n )
        ( B_21  B_22  ...  B_2n )
        (  ...                  )
        ( B_m1  B_m2  ...  B_mn )


46 MATRICES [CHAP. 3 

Furthermore, suppose the corresponding blocks of A and B have the same size. Adding
these corresponding blocks adds the corresponding elements of A and B. Accordingly,

    A + B = ( A_11 + B_11   A_12 + B_12   ...   A_1n + B_1n )
            ( A_21 + B_21   A_22 + B_22   ...   A_2n + B_2n )
            (  ...                                          )
            ( A_m1 + B_m1   A_m2 + B_m2   ...   A_mn + B_mn )

The case of matrix multiplication is less obvious but still true. That is, suppose matrices
U and V are partitioned into blocks as follows

    U = ( U_11  U_12  ...  U_1p )          V = ( V_11  V_12  ...  V_1n )
        ( U_21  U_22  ...  U_2p )              ( V_21  V_22  ...  V_2n )
        (  ...                  )              (  ...                  )
        ( U_m1  U_m2  ...  U_mp )              ( V_p1  V_p2  ...  V_pn )

such that the number of columns of each block U_ik is equal to the number of rows of each
block V_kj. Then

    UV = ( W_11  W_12  ...  W_1n )
         ( W_21  W_22  ...  W_2n )
         (  ...                  )
         ( W_m1  W_m2  ...  W_mn )

where W_ij = U_i1 V_1j + U_i2 V_2j + ... + U_ip V_pj

The proof of the above formula for UV is straightforward, but detailed and lengthy. It 
is left as a supplementary problem (Problem 3.68). 
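
The block-multiplication formula W_ij = U_i1 V_1j + ... + U_ip V_pj can also be checked numerically. The Python sketch below is an illustration only (all helper names are ours); it splits two random 4 x 4 matrices into 2 x 2 blocks, multiplies blockwise, and compares the reassembled result with the ordinary product.

    import random

    def mat_mul(A, B):
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

    def mat_add(A, B):
        return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

    def block(M, i, j):
        """The 2 x 2 block of M in block-row i, block-column j."""
        return [row[2*j:2*j + 2] for row in M[2*i:2*i + 2]]

    A = [[random.randint(-3, 3) for _ in range(4)] for _ in range(4)]
    B = [[random.randint(-3, 3) for _ in range(4)] for _ in range(4)]

    # W_ij = A_i1 B_1j + A_i2 B_2j
    W = [[mat_add(mat_mul(block(A, i, 0), block(B, 0, j)),
                  mat_mul(block(A, i, 1), block(B, 1, j))) for j in range(2)] for i in range(2)]

    # Reassemble the blocks into a 4 x 4 matrix and compare with the ordinary product.
    assembled = [W[i][0][r] + W[i][1][r] for i in range(2) for r in range(2)]
    print(assembled == mat_mul(A, B))   # True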



Solved Problems 



MATRIX ADDITION AND SCALAR MULTIPLICATION 


3.1. 


Compute: 
















-{I 


2 -3 4^ 
-5 1 -1 


\ /3 -5 

J I2 - 


6 -1\ 
-2 -3y 










^^ [1 - 


-1 1) ^ 


/3 5\ 


(iii) 


-{I 


2 

-5 


-3 
6 




(i) Add corresponding entries: 














n 2- 

\Q -5 


I -0 - (I 


-5 6 - 
-2 - 


-^) 












/I + 3 


2-5 


-3 + 6 


4- 


- l^ 








(0 + 2 


-5 + 


1-2 


-1 - 


-3y 



4-333 
2 -5 -1 -4 



(ii) The sum is not defined since the matrices have different shapes. 
(iii) Multiply each entry in the matrix by the scalar —3: 

0/1 2 -3N _ / -3 -6 9 

^ ' ' - - ' - ' -12 15-18 



CHAP. 3] MATRICES 47 

3.2. Let A = ( 2 -5  1 ), B = ( 1 -2 -3 ), C = ( 0  1 -2 ). Find 3A + 4B - 2C.
             ( 3  0 -4 )      ( 0 -1  5 )      ( 1 -1 -1 )

First perform the scalar multiplication, and then the matrix addition:

    3A + 4B - 2C = ( 6 -15   3 ) + ( 4 -8 -12 ) + (  0 -2  4 )   =   ( 10 -25 -5 )
                   ( 9   0 -12 )   ( 0 -4  20 )   ( -2  2  2 )       (  7  -2 10 )



3.3. Find x, y, z and w if

    3 ( x  y )   =   (  x      6 )  +  (   4      x + y )
      ( z  w )       ( -1    2w )      ( z + w      3   )

First write each side as a single matrix:

    ( 3x  3y )   =   (    x + 4      x + y + 6 )
    ( 3z  3w )       ( z + w - 1       2w + 3  )

Set corresponding entries equal to each other to obtain the system of four equations,

    3x = x + 4                2x = 4
    3y = x + y + 6     or     2y = 6 + x
    3z = z + w - 1            2z = w - 1
    3w = 2w + 3                w = 3

The solution is: x = 2, y = 4, z = 1, w = 3.



3.4. Prove Theorem 3.1(v): Let A and B be m x n matrices and k a scalar. Then
k(A + B) = kA + kB.

Suppose A = (a_ij) and B = (b_ij). Then a_ij + b_ij is the ij-entry of A + B, and so k(a_ij + b_ij)
is the ij-entry of k(A + B). On the other hand, k a_ij and k b_ij are the ij-entries of kA and kB respec-
tively, and so k a_ij + k b_ij is the ij-entry of kA + kB. But k, a_ij and b_ij are scalars in a field; hence

    k(a_ij + b_ij) = k a_ij + k b_ij,   for every i, j

Thus k(A + B) = kA + kB, as corresponding entries are equal.

Remark: Observe the similarity of this proof and the proof of Theorem 1.1(v) in Problem 1.6, page
7. In fact, all other sections in the above theorem are proven in the same way as the
corresponding sections of Theorem 1.1.



MATRIX MULTIPLICATION 

3.5. Let (r x s) denote a matrix with shape rxs. Find the shape of the following products 
if the product is defined: 

(i) (2x3)(3x4) (iii) (1 x 2)(3 x 1) (v) (3 x 4)(3 x 4) 

(ii) (4xl)(lx2) (iv) (5 x 2)(2 x 3) (vi) (2 x 2)(2 x 4) 

Recall that an m X p matrix and a qXn matrix are multipliable only when p = q, and then 
the product is an m X n matrix. Thus each of the above products is defined if the "inner" numbers 
are equal, and then the product will have the shape of the "outer" numbers in the given order. 



(i) The product 
(ii) The product 



is a 2 X 4 matrix, 
is a 4 X 2 matrix. 



(iv) The product 
(v) The product 



(iii) The product is not defined since the inner numbers 2 and 3 are not equal. 



is a 5 X 3 matrix. 

is not defined even though the matrices have the same shape. 



(vi) The product is a 2 X 4 matrix, 



48 MATRICES [CHAP. 3 



3.6. Let A = ( 1  3 )  and  B = ( 2  0 -4 ). Find (i) AB, (ii) BA.
             ( 2 -1 )           ( 3 -2  6 )

(i) Since A is 2 x 2 and B is 2 x 3, the product AB is defined and is a 2 x 3 matrix. To obtain the
entries in the first row of AB, multiply the first row (1, 3) of A by the columns ( 2 ), (  0 ) and ( -4 )
of B, respectively:                                                              ( 3 )  ( -2 )     (  6 )

    ( 1  3 ) ( 2  0 -4 )   =   ( 1·2 + 3·3    1·0 + 3·(-2)    1·(-4) + 3·6 )   =   ( 11  -6  14 )
    ( 2 -1 ) ( 3 -2  6 )       (                                           )       (            )

To obtain the entries in the second row of AB, multiply the second row (2, -1) of A by the
columns of B, respectively:

    ( 1  3 ) ( 2  0 -4 )   =   (      11              -6                  14         )
    ( 2 -1 ) ( 3 -2  6 )       ( 2·2 + (-1)·3   2·0 + (-1)·(-2)   2·(-4) + (-1)·6 )

Thus    AB = ( 11  -6   14 )
             (  1   2  -14 )

(ii) Note that B is 2 x 3 and A is 2 x 2. Since the inner numbers 3 and 2 are not equal, the product
BA is not defined.



3.7. Given A = (2, 1) and B = ( 1 -2  0 ), find (i) AB, (ii) BA.
                              ( 4  5 -3 )

(i) Since A is 1 x 2 and B is 2 x 3, the product AB is defined and is a 1 x 3 matrix, i.e. a row
vector with 3 components. To obtain the components of AB, multiply the row of A by each
column of B:

    AB = (2, 1) ( 1 -2  0 )  =  (2·1 + 1·4,  2·(-2) + 1·5,  2·0 + 1·(-3))  =  (6, 1, -3)
                ( 4  5 -3 )

(ii) Note that B is 2 x 3 and A is 1 x 2. Since the inner numbers 3 and 1 are not equal, the product
BA is not defined.



3.8. Given A = (  2 -1 )  and  B = ( 1 -2 -5 ), find (i) AB, (ii) BA.
               (  1  0 )           ( 3  4  0 )
               ( -3  4 )

(i) Since A is 3 x 2 and B is 2 x 3, the product AB is defined and is a 3 x 3 matrix. To obtain the
first row of AB, multiply the first row of A by each column of B, respectively:

    ( 2 - 3,   -4 - 4,   -10 + 0 )   =   ( -1  -8  -10 )

To obtain the second row of AB, multiply the second row of A by each column of B,
respectively:

    ( 1 + 0,   -2 + 0,   -5 + 0 )   =   ( 1  -2  -5 )

To obtain the third row of AB, multiply the third row of A by each column of B, respectively:

    ( -3 + 12,   6 + 16,   15 + 0 )   =   ( 9  22  15 )

Thus    AB = ( -1  -8  -10 )
             (  1  -2   -5 )
             (  9  22   15 )

(ii) Since B is 2 x 3 and A is 3 x 2, the product BA is defined and is a 2 x 2 matrix. To obtain the
first row of BA, multiply the first row of B by each column of A, respectively:

    ( 2 - 2 + 15,   -1 + 0 - 20 )   =   ( 15  -21 )

To obtain the second row of BA, multiply the second row of B by each column of A, respectively:

    ( 6 + 4 + 0,   -3 + 0 + 0 )   =   ( 10  -3 )

Thus    BA = ( 15  -21 )
             ( 10   -3 )

Remark: Observe that in this case both AB and BA are defined, but they are not equal; in fact they
do not even have the same shape.



3.9. Let A = ( 2 -1  0 )  and  B = ( 1 -4  0  1 ).
             ( 1  0 -3 )           ( 2 -1  3 -1 )
                                   ( 4  0 -2  0 )

(i) Determine the shape of AB. (ii) Let c_ij denote the element in the ith row and
jth column of the product matrix AB, that is, AB = (c_ij). Find: c_23, c_14 and c_21.

(i) Since A is 2 x 3 and B is 3 x 4, the product AB is a 2 x 4 matrix.

(ii) Now c_ij is defined as the product of the ith row of A by the jth column of B. Hence:

    c_23 = (1, 0, -3)·(0, 3, -2) = 1·0 + 0·3 + (-3)·(-2) = 0 + 0 + 6 = 6

    c_14 = (2, -1, 0)·(1, -1, 0) = 2·1 + (-1)·(-1) + 0·0 = 2 + 1 + 0 = 3

    c_21 = (1, 0, -3)·(1, 2, 4) = 1·1 + 0·2 + (-3)·4 = 1 + 0 - 12 = -11


3.10. Compute: 



(i) 



(ii) 



1 6\/4 
-3 5/(2 -1 



1 6 
-3 5 



2 

-7 



(iii) 
(iv) 



-^(-l 



(3,2) 



(V) (2,-1) 



-6 



(i) The first factor is 2 X 2 and the second is 2 X 2, so the product is defined and is a 2 X 2 matrix: 



50 MATRICES 



[CHAP. 3 



1 6Y4 0\ ^ / l'4 + 6-2 l-0 + 6'(-l) \ _ / 16 -6' 

^-3 5A2 -1/ V(-3)-4 + 5.2 (-3)-0 + 5-(-l)y ~ [-2 -5^ 

(ii) The first factor is 2 X 2 and the second is 2 X 1, so the product is defined and is a 2 X 1 matrix: 

1 ^V 2\ _ / 1-2 + 6- (-7) \ _ /-40' 
-3 5A-7; \(-3)'2 + 5'(-7)) ^ [-41^ 

(iii) Now the first factor is 2 X 1 and the second is 2X2. Since the inner numbers 1 and 2 are 
distinct, the product is not defined. 

(iv) Here the first factor is 2 X 1 and the second is 1 X 2, so the product is defined and is a 2 X 2 
matrix: 

3 2^ 



{>'' = il--l i:i) = 



18 12 



(v) The first factor is 1 X 2 and the second is 2 X 1, so the product is defined and is a 1 X 1 matrix 
which we frequently write as a scalar. 

(2,-l)(^_g) = (2-1 + (-1). (-6)) = (8) = 8 

3.11. Prove Theorem 3.2(i): (AB)C = A{BC). 

Let A = (oy), B = (bfl,) and C = (e^). Furthermore, let AB = S = (sj^) and BC = T = (t,,). 
Then 

Sjfc = ajiftifc + at2b2k + • • • + at„6mfc = 2 Oyftjj. 

3=1 

n 

hi = ^ji'^ii + bjiCn + • • • + bj„c„i = 2 fcjfcCfci 

lc=l 

Now multiplying S by C, i.e. (AB) by C, the element in the ith row and Ith column of the matrix 
{AB)C is 



n m 



»ilCll + SJ2C21 + • • • + Si„C„i = 2 StfcCfcl =22 {"'ifijkiOkl 

k=l fc=l j=l 

On the other hand, multiplying A by T, i.e. A by BC, the element in the tth row and fth column 
of the matrix A{BC) is 

tn m n 

»il*ll + «i2*2! + • • • + aim*ml = 2 »ij*jl =22 ««(6jfcCfci) 

Since the above sums are equal, the theorem is proven. 

3.12. Prove Theorem 3.2(ii): A(B + C) = AB + AC.

Let A = (a_ij), B = (b_jk) and C = (c_jk). Furthermore, let D = B + C = (d_jk), E = AB = (e_ik)
and F = AC = (f_ik). Then

    d_jk  =  b_jk + c_jk

    e_ik  =  a_i1 b_1k + a_i2 b_2k + ... + a_im b_mk  =  Σ_{j=1..m} a_ij b_jk

    f_ik  =  a_i1 c_1k + a_i2 c_2k + ... + a_im c_mk  =  Σ_{j=1..m} a_ij c_jk

Hence the element in the ith row and kth column of the matrix AB + AC is

    e_ik + f_ik  =  Σ_{j=1..m} a_ij b_jk + Σ_{j=1..m} a_ij c_jk  =  Σ_{j=1..m} a_ij (b_jk + c_jk)

On the other hand, the element in the ith row and kth column of the matrix AD = A(B + C) is

    a_i1 d_1k + a_i2 d_2k + ... + a_im d_mk  =  Σ_{j=1..m} a_ij d_jk  =  Σ_{j=1..m} a_ij (b_jk + c_jk)

Thus A(B + C) = AB + AC since the corresponding elements are equal.
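
The two identities just proved can also be spot-checked numerically; the added sketch below is only a
sanity check on random integer matrices, not a proof, and it reuses the matmul function sketched earlier:

    import random

    def rand_matrix(rows, cols):
        return [[random.randint(-5, 5) for _ in range(cols)] for _ in range(rows)]

    def add(A, B):
        # entrywise sum of two matrices of the same shape
        return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

    A, B, C = rand_matrix(2, 3), rand_matrix(3, 4), rand_matrix(3, 4)
    D = rand_matrix(4, 2)
    assert matmul(matmul(A, B), D) == matmul(A, matmul(B, D))        # (AB)D = A(BD)
    assert matmul(A, add(B, C)) == add(matmul(A, B), matmul(A, C))   # A(B+C) = AB + AC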




TRANSPOSE 

3.13. Find the transpose A* of the matrix A = 



Rewrite the rows of A as the columns of A': 




3.14. Let A be an arbitrary matrix. Under what conditions is the product AA^t defined?

Suppose A is an m x n matrix; then A^t is n x m. Thus the product AA^t is always defined.
Observe that A^tA is also defined. Here AA^t is an m x m matrix, whereas A^tA is an n x n matrix.



3.15. Let A  =  / 1    2    0 \ .   Find (i) AA^t, (ii) A^tA.
                \ 3   -1    4 /

To obtain A^t, rewrite the rows of A as columns:   A^t  =  / 1    3 \
                                                           | 2   -1 |
                                                           \ 0    4 /
Then

    AA^t  =  / 1    2    0 \ / 1    3 \   =   / 1·1 + 2·2 + 0·0        1·3 + 2·(-1) + 0·4    \   =   / 5    1 \
             \ 3   -1    4 / | 2   -1 |       \ 3·1 + (-1)·2 + 4·0     3·3 + (-1)·(-1) + 4·4 /       \ 1   26 /
                             \ 0    4 /

    A^tA  =  / 1    3 \ / 1    2    0 \   =   / 1·1 + 3·3      1·2 + 3·(-1)       1·0 + 3·4     \   =   / 10   -1   12 \
             | 2   -1 | \ 3   -1    4 /       | 2·1 + (-1)·3   2·2 + (-1)·(-1)    2·0 + (-1)·4  |       | -1    5   -4 |
             \ 0    4 /                       \ 0·1 + 4·3      0·2 + 4·(-1)       0·0 + 4·4     /       \ 12   -4   16 /

3.16. Prove Theorem 3.3(iv): (AB)^t = B^t A^t.

Let A = (a_ij) and B = (b_jk). Then the element in the ith row and jth column of the matrix
AB is

    a_i1 b_1j + a_i2 b_2j + ... + a_im b_mj     (1)

Thus (1) is the element which appears in the jth row and ith column of the transpose matrix (AB)^t.

On the other hand, the jth row of B^t consists of the elements from the jth column of B:

    (b_1j   b_2j   ...   b_mj)     (2)

Furthermore, the ith column of A^t consists of the elements from the ith row of A:

    / a_i1 \
    | a_i2 |
    |  ... |     (3)
    \ a_im /

Consequently, the element appearing in the jth row and ith column of the matrix B^t A^t is the
product of (2) by (3), which gives (1). Thus (AB)^t = B^t A^t.
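
The transpose and the rule (AB)^t = B^t A^t can also be checked numerically. The sketch below is an
added illustration (reusing the matmul function from above) on the matrix A of Problem 3.15 and the
matrix B of Problem 3.9:

    def transpose(A):
        # the rows of A become the columns of A^t
        return [list(col) for col in zip(*A)]

    A = [[1, 2, 0], [3, -1, 4]]
    print(transpose(A))                      # [[1, 3], [2, -1], [0, 4]]
    print(matmul(A, transpose(A)))           # [[5, 1], [1, 26]]

    B = [[1, -4, 0, 1], [2, -1, 3, -1], [4, 0, -2, 0]]
    assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))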






ECHELON MATRICES AND ELEMENTARY ROW OPERATIONS 
3.17. Circle the distinguished elements in each of the following echelon matrices. Which 
are row reduced echelon matrices? 

/l 2 -3 l\ /O 1 7 -5 0\ 
5 2-4, 1, 
\0 7 3/ \0 0/ 

The distinguished elements are the first nonzero entries in the rows; hence 

[l)2-3 0l\ /0©7-5 0\ /0O5O2^ 

0(1)2 -4, 00,00 20 4 

3/ \0 0/ \0 7, 

An echelon matrix is row reduced if its distinguished elements are each 1 and are the only nonzero 
entries in their respective columns. Thus the second and third matrices are row reduced, but the 
first is not. 



3.18. Given A  =  / 1   -2    3   -1 \ .   (i) Reduce A to echelon form. (ii) Reduce A to row
                  | 2   -1    2    2 |     canonical form, i.e. to row reduced echelon form.
                  \ 3    1    2    3 /

(i) Apply the operations R2 → -2R1 + R2 and R3 → -3R1 + R3, and then the operation
R3 → -7R2 + 3R3 to reduce A to echelon form:

    A  to  / 1   -2    3   -1 \   to   / 1   -2    3    -1 \
           | 0    3   -4    4 |        | 0    3   -4     4 |
           \ 0    7   -7    6 /        \ 0    0    7   -10 /

(ii) Method 1.  Apply the operation R1 → 2R2 + 3R1, and then the operations R1 → -R3 + 7R1
and R2 → 4R3 + 7R2 to the last matrix in (i) to further reduce A:

    to  / 3    0    1     5 \   to   / 21    0    0    45 \
        | 0    3   -4     4 |        |  0   21    0   -12 |
        \ 0    0    7   -10 /        \  0    0    7   -10 /

Finally, multiply R1 by 1/21, R2 by 1/21 and R3 by 1/7 to obtain the row canonical form of A:

    / 1    0    0    15/7 \
    | 0    1    0    -4/7 |
    \ 0    0    1   -10/7 /

Method 2.  In the last matrix in (i), multiply R2 by 1/3 and R3 by 1/7 to obtain an echelon
matrix where the distinguished elements are each 1:

    / 1   -2      3      -1 \
    | 0    1   -4/3     4/3 |
    \ 0    0      1   -10/7 /

Now apply the operation R1 → 2R2 + R1, and then the operations R2 → (4/3)R3 + R2 and
R1 → (-1/3)R3 + R1 to obtain the above row canonical form of A.

Remark: Observe that one advantage of the first method is that fractions did not appear 
until the very last step. 
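
The reduction above is ordinary Gauss-Jordan elimination and can be automated. The following
sketch is an added illustration (not the book's algorithm verbatim); it works with exact fractions so
that entries such as 15/7 come out exactly:

    from fractions import Fraction

    def row_canonical_form(M):
        # Gauss-Jordan elimination over the rationals.
        A = [[Fraction(x) for x in row] for row in M]
        rows, cols = len(A), len(A[0])
        r = 0
        for c in range(cols):
            # find a pivot in column c at or below row r
            pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
            if pivot is None:
                continue
            A[r], A[pivot] = A[pivot], A[r]
            A[r] = [x / A[r][c] for x in A[r]]            # make the pivot equal to 1
            for i in range(rows):
                if i != r and A[i][c] != 0:
                    A[i] = [a - A[i][c] * b for a, b in zip(A[i], A[r])]
            r += 1
        return A

    print(row_canonical_form([[1, -2, 3, -1], [2, -1, 2, 2], [3, 1, 2, 3]]))
    # rows: (1, 0, 0, 15/7), (0, 1, 0, -4/7), (0, 0, 1, -10/7)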






3.19. Determine the row canonical form of A  =  / 0    1    3   -2 \
                                                | 2    1   -4    3 |
                                                \ 2    3    2   -1 /

    A  to  / 2    1   -4    3 \   to  / 2    1   -4    3 \   to  / 2    1   -4    3 \
           | 0    1    3   -2 |       | 0    1    3   -2 |       | 0    1    3   -2 |
           \ 2    3    2   -1 /       \ 0    2    6   -4 /       \ 0    0    0    0 /

Note that the third matrix is already in echelon form. Applying R1 → -R2 + R1 and then
multiplying R1 by 1/2 yields the row canonical form:

    / 1    0   -7/2    5/2 \
    | 0    1     3     -2  |
    \ 0    0     0      0  /

3.20. Reduce A  =  /  6    3   -4 \   to echelon form, and then to row reduced echelon form,
                   | -4    1   -6 |   i.e. to its row canonical form.
                   \  1    2   -5 /

The computations are usually simpler if the "pivotal" element is 1. Hence first interchange the
first and third rows:

    A  to  /  1    2   -5 \   to  / 1    2    -5 \   to  / 1    2    -5 \
           | -4    1   -6 |       | 0    9   -26 |       | 0    9   -26 |
           \  6    3   -4 /       \ 0   -9    26 /       \ 0    0     0 /

Note that the third matrix is already in echelon form. Multiplying R2 by 1/9 and then applying
R1 → -2R2 + R1 yields the row canonical form:

    / 1    0     7/9 \
    | 0    1   -26/9 |
    \ 0    0       0 /

3.21. Show that each of the following elementary row operations has an inverse operation
of the same type.

    Interchange the ith row and the jth row: Ri ↔ Rj.

    Multiply the ith row by a nonzero scalar k: Ri → kRi, k ≠ 0.

    Replace the ith row by k times the jth row plus the ith row: Ri → kRj + Ri.

(i) Interchanging the same two rows twice, we obtain the original matrix; that is, this operation
is its own inverse.

(ii) Multiplying the ith row by k and then by k^-1, or by k^-1 and then by k, we obtain the original
matrix. In other words, the operations Ri → kRi and Ri → k^-1 Ri are inverses.

(iii) Applying the operation Ri → kRj + Ri and then the operation Ri → -kRj + Ri, or apply-
ing the operation Ri → -kRj + Ri and then the operation Ri → kRj + Ri, we obtain the orig-
inal matrix. In other words, the operations Ri → kRj + Ri and Ri → -kRj + Ri are
inverses.

SQUARE MATRICES 

3.22. Let A  =  / 1    2 \ .   Find (i) A², (ii) A³, (iii) f(A), where f(x) = 2x³ - 4x + 5.
                \ 4   -3 /

(i)  A²  =  AA  =  / 1    2 \ / 1    2 \   =   / 1·1 + 2·4        1·2 + 2·(-3)     \   =   /  9   -4 \
                   \ 4   -3 / \ 4   -3 /       \ 4·1 + (-3)·4     4·2 + (-3)·(-3)  /       \ -8   17 /

(ii) A³  =  AA²  =  / 1    2 \ /  9   -4 \   =   / 1·9 + 2·(-8)       1·(-4) + 2·17     \   =   / -7    30 \
                    \ 4   -3 / \ -8   17 /       \ 4·9 + (-3)·(-8)    4·(-4) + (-3)·17  /       \ 60   -67 /

(iii) To find f(A), first substitute A for x and 5I for the constant 5 in the given polynomial
f(x) = 2x³ - 4x + 5:

    f(A)  =  2A³ - 4A + 5I  =  2 / -7    30 \  -  4 / 1    2 \  +  5 / 1   0 \
                                 \ 60   -67 /       \ 4   -3 /       \ 0   1 /

Then multiply each matrix by its respective scalar:

    =  / -14     60 \  +  /  -4    -8 \  +  / 5   0 \
       \ 120   -134 /     \ -16    12 /     \ 0   5 /

Lastly, add the corresponding elements in the matrices:

    =  / -14 - 4 + 5        60 - 8 + 0    \   =   / -13     52 \
       \ 120 - 16 + 0     -134 + 12 + 5   /       \ 104   -117 /



3.23. Referring to Problem 3.22, show that A is a zero of the polynomial g(x) = x² + 2x - 11.

A is a zero of g(x) if the matrix g(A) is the zero matrix. Compute g(A) as was done for f(A),
i.e. first substitute A for x and 11I for the constant 11 in g(x) = x² + 2x - 11:

    g(A)  =  A² + 2A - 11I  =  /  9   -4 \  +  2 / 1    2 \  -  11 / 1   0 \
                               \ -8   17 /       \ 4   -3 /        \ 0   1 /

Then multiply each matrix by the scalar preceding it:

    =  /  9   -4 \  +  / 2    4 \  +  / -11     0 \
       \ -8   17 /     \ 8   -6 /     \   0   -11 /

Lastly, add the corresponding elements in the matrices:

    g(A)  =  /  9 + 2 - 11      -4 + 4 + 0    \   =   / 0   0 \
             \ -8 + 8 + 0       17 - 6 - 11   /       \ 0   0 /

Since g(A) = 0, A is a zero of the polynomial g(x).



3.24. Given A  =  / 1    3 \ .   Find a nonzero column vector u = / x \ such that Au = 3u.
                  \ 4   -3 /                                      \ y /

First set up the matrix equation Au = 3u:

    / 1    3 \ / x \   =   3 / x \
    \ 4   -3 / \ y /         \ y /

Write each side as a single matrix (column vector):

    /  x + 3y \   =   / 3x \
    \ 4x - 3y /       \ 3y /

Set corresponding elements equal to each other to obtain the system of equations (and reduce to
echelon form):

    x + 3y = 3x         2x - 3y = 0         2x - 3y = 0
                  or                  or                  or    2x - 3y = 0
    4x - 3y = 3y        4x - 6y = 0               0 = 0

The system reduces to one homogeneous equation in two unknowns, and so has an infinite number
of solutions. To obtain a nonzero solution let, say, y = 2; then x = 3. That is, x = 3, y = 2 is a
solution of the system. Thus the vector u = / 3 \ is nonzero and has the property that Au = 3u.
                                            \ 2 /




3.25. Find the inverse of  / 3   5 \ .
                           \ 2   3 /

Method 1.  We seek scalars x, y, z and w for which

    / 3   5 \ / x   y \   =   / 1   0 \      or      / 3x + 5z    3y + 5w \   =   / 1   0 \
    \ 2   3 / \ z   w /       \ 0   1 /              \ 2x + 3z    2y + 3w /       \ 0   1 /

or which satisfy

    3x + 5z = 1          and          3y + 5w = 0
    2x + 3z = 0                       2y + 3w = 1

The solution of the first system is x = -3, z = 2, and of the second system is y = 5, w = -3.

Thus the inverse of the given matrix is  / -3    5 \ .
                                         \  2   -3 /

Method 2.  We derived the general formula for the inverse A^-1 of the 2 x 2 matrix A:

    A  =  / a   b \ ,      A^-1  =  (1/|A|) /  d   -b \ ,      where  |A| = ad - bc
          \ c   d /                         \ -c    a /

Thus if A = / 3   5 \ , then |A| = 9 - 10 = -1  and  A^-1 = -1 /  3   -5 \  =  / -3    5 \ .
            \ 2   3 /                                          \ -2    3 /     \  2   -3 /
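
Method 2's formula can be packaged as a small computation; the following added sketch illustrates
that 2 x 2 formula only (it does not handle larger matrices):

    from fractions import Fraction

    def inverse_2x2(M):
        (a, b), (c, d) = M
        det = a * d - b * c
        if det == 0:
            raise ValueError("matrix is not invertible")
        return [[Fraction(d, det), Fraction(-b, det)],
                [Fraction(-c, det), Fraction(a, det)]]

    print(inverse_2x2([[3, 5], [2, 3]]))   # [[-3, 5], [2, -3]]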



MISCELLANEOUS PROBLEMS 

3.26, Compute AB using block multiplication, where 

2 I 1\ 
3 4 I 1 and B = 

\0 I 2, 



Hence 



„ ^1 and S = ( ) where E, F, G, R, S and T are the given blocks. 

GJ \0 TJ 



ER ES + FT\ 
GT J ~ 


//9 12 15N /3N /I 
Vl9 26 33/ V^/ \0 






\ ( 0) (2) 



AB = (^^^ ^^^+/^) = jVl9 26 33y V^yVoyj = ji9 26 33 7 



3.27. Suppose B = (R1, R2, ..., Rn), i.e. that Ri is the ith row of B. Suppose BA is de-
fined. Show that BA = (R1 A, R2 A, ..., Rn A), i.e. that Ri A is the ith row of BA.

Let A^1, A^2, ..., A^m denote the columns of A. By definition of matrix multiplication, the ith row
of BA is (Ri · A^1, Ri · A^2, ..., Ri · A^m). But by matrix multiplication, Ri A = (Ri · A^1, Ri · A^2, ...,
Ri · A^m). Thus the ith row of BA is Ri A.

3.28. Let e_i = (0, ..., 0, 1, 0, ..., 0) be the row vector with 1 in the ith position and 0 else-
where. Show that e_i A = R_i, the ith row of A.

Observe that e_i is the ith row of I, the identity matrix. By the preceding problem, the ith row
of IA is e_i A. But IA = A. Accordingly, e_i A = R_i, the ith row of A.

3.29. Show: (i) If A has a zero row, then AB has a zero row. 

(ii) If B has a zero column, then AB has a zero column. 

(iii) Any matrix with a zero row or a zero column is not invertible. 

(i) Let R_i be the zero row of A, and B^1, ..., B^n the columns of B. Then the ith row of AB is

    (R_i · B^1,  R_i · B^2,  ...,  R_i · B^n)  =  (0, 0, ..., 0)

(ii) Let C_j be the zero column of B, and A_1, ..., A_m the rows of A. Then the jth column of AB is

    / A_1 · C_j \       / 0 \
    | A_2 · C_j |   =   | 0 |
    |    ...    |       | : |
    \ A_m · C_j /       \ 0 /

(iii) A matrix A is invertible means that there exists a matrix A^-1 such that AA^-1 = A^-1 A = I.
But the identity matrix I has no zero row or zero column; hence by (i) and (ii) A cannot have
a zero row or a zero column. In other words, a matrix with a zero row or a zero column cannot
be invertible.

3.30. Let A and B be invertible matrices (of the same order). Show that the product
AB is also invertible and (AB)^-1 = B^-1 A^-1. Thus by induction, (A_1 A_2 ··· A_n)^-1 =
A_n^-1 ··· A_2^-1 A_1^-1 where the A_i are invertible.

    (AB)(B^-1 A^-1)  =  A(BB^-1)A^-1  =  AIA^-1  =  AA^-1  =  I

and

    (B^-1 A^-1)(AB)  =  B^-1(A^-1 A)B  =  B^-1 IB  =  B^-1 B  =  I

Thus (AB)^-1 = B^-1 A^-1.

3.31. Let u and v be distinct vectors. Show that, for each scalar k ∈ K, the vectors
u + k(u - v) are distinct.

It suffices to show that if

    u + k1(u - v)  =  u + k2(u - v)     (1)

then k1 = k2. Suppose (1) holds. Then

    k1(u - v)  =  k2(u - v)     or     (k1 - k2)(u - v)  =  0

Since u and v are distinct, u - v ≠ 0. Hence k1 - k2 = 0 and k1 = k2.



ELEMENTARY MATRICES AND APPLICATIONS* 

(* This section is rather detailed and may be omitted in a first reading. It is not needed except for
certain results in Chapter 9 on determinants.)

3.32. A matrix obtained from the identity matrix by a single elementary row operation is
called an elementary matrix. Determine the 3-square elementary matrices corre-
sponding to the operations R1 ↔ R2, R3 → -7R3 and R2 → -3R1 + R2.

Apply the operations to the identity matrix I_3  =  / 1   0   0 \   to obtain
                                                    | 0   1   0 |
                                                    \ 0   0   1 /

    E1  =  / 0   1   0 \       E2  =  / 1   0    0 \       E3  =  /  1   0   0 \
           | 1   0   0 |              | 0   1    0 |              | -3   1   0 |
           \ 0   0   1 /              \ 0   0   -7 /              \  0   0   1 /


3.33. Prove: Let e be an elementary row operation and E the corresponding m-square elemen-
tary matrix, i.e. E = e(I_m). Then for any m x n matrix A, e(A) = EA. That is, the re-
sult e(A) of applying the operation e on the matrix A can be obtained by multiplying
A by the corresponding elementary matrix E.

Let R_i be the ith row of A; we denote this by writing A = (R_1, ..., R_m). By Problem 3.27, if
B is a matrix for which AB is defined, then AB = (R_1 B, ..., R_m B). We also let

    e_i  =  (0, ..., 0, 1, 0, ..., 0),     where 1 is the ith component

By Problem 3.28, e_i A = R_i. We also remark that I = (e_1, ..., e_m) is the identity matrix.

(i) Let e be the elementary row operation R_i ↔ R_j. Then, with e_j placed in the ith position and
e_i in the jth position,

    E  =  e(I)  =  (e_1, ..., e_j, ..., e_i, ..., e_m)     and     e(A)  =  (R_1, ..., R_j, ..., R_i, ..., R_m)

Thus

    EA  =  (e_1 A, ..., e_j A, ..., e_i A, ..., e_m A)  =  (R_1, ..., R_j, ..., R_i, ..., R_m)  =  e(A)

(ii) Now let e be the elementary row operation R_i → kR_i, k ≠ 0. Then, with ke_i in the ith position,

    E  =  e(I)  =  (e_1, ..., ke_i, ..., e_m)     and     e(A)  =  (R_1, ..., kR_i, ..., R_m)

Thus

    EA  =  (e_1 A, ..., ke_i A, ..., e_m A)  =  (R_1, ..., kR_i, ..., R_m)  =  e(A)

(iii) Lastly, let e be the elementary row operation R_i → kR_j + R_i. Then, with ke_j + e_i in the
ith position,

    E  =  e(I)  =  (e_1, ..., ke_j + e_i, ..., e_m)     and     e(A)  =  (R_1, ..., kR_j + R_i, ..., R_m)

Using (ke_j + e_i)A = k(e_j A) + e_i A = kR_j + R_i, we have

    EA  =  (e_1 A, ..., (ke_j + e_i)A, ..., e_m A)  =  (R_1, ..., kR_j + R_i, ..., R_m)  =  e(A)

Thus we have proven the theorem.
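
The theorem can also be illustrated numerically: performing a row operation on A gives the same
matrix as multiplying A on the left by the elementary matrix obtained by performing that operation
on I. A small added sketch (reusing matmul from above; the helper names are ours):

    def identity(n):
        return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

    def add_multiple_of_row(M, k, j, i):
        # the operation R_i -> k R_j + R_i, applied to a copy of M (0-based indices)
        M = [row[:] for row in M]
        M[i] = [k * a + b for a, b in zip(M[j], M[i])]
        return M

    A = [[1, -2, 3, -1], [2, -1, 2, 2], [3, 1, 2, 3]]
    E = add_multiple_of_row(identity(3), -2, 0, 1)            # E = e(I) for R2 -> -2 R1 + R2
    assert matmul(E, A) == add_multiple_of_row(A, -2, 0, 1)   # EA = e(A)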

3.34. Show that A is row equivalent to B if and only if there exist elementary matrices
E_1, ..., E_s such that E_s ··· E_2 E_1 A = B.

By definition, A is row equivalent to B if there exist elementary row operations e_1, ..., e_s for
which e_s(···(e_2(e_1(A)))···) = B. But, by the preceding problem, the above holds if and only if
E_s ··· E_2 E_1 A = B where E_i is the elementary matrix corresponding to e_i.

3.35. Show that the elementary matrices are invertible and that their inverses are also
elementary matrices.

Let E be the elementary matrix corresponding to the elementary row operation e: e(I) = E.
Let e' be the inverse operation of e (see Problem 3.21) and E' its corresponding elementary matrix.
Then, by Problem 3.33,

    I  =  e'(e(I))  =  e'(E)  =  E'E     and     I  =  e(e'(I))  =  e(E')  =  EE'

Therefore E' is the inverse of E.

3.36. Prove that the following are equivalent:

(i) A is invertible.

(ii) A is row equivalent to the identity matrix I.

(iii) A is a product of elementary matrices.

Suppose A is invertible and suppose A is row equivalent to the row reduced echelon matrix B.
Then there exist elementary matrices E_1, E_2, ..., E_s such that E_s ··· E_2 E_1 A = B. Since A is invert-
ible and each elementary matrix E_i is invertible, the product is invertible. But if B ≠ I, then B
has a zero row (Problem 3.47); hence B is not invertible (Problem 3.29). Thus B = I. In other
words, (i) implies (ii).

Now if (ii) holds, then there exist elementary matrices E_1, E_2, ..., E_s such that

    E_s ··· E_2 E_1 A  =  I,     and so     A  =  (E_s ··· E_2 E_1)^-1  =  E_1^-1 E_2^-1 ··· E_s^-1

By the preceding problem, the E_i^-1 are also elementary matrices. Thus (ii) implies (iii).

Now if (iii) holds (A = E_1 E_2 ··· E_s), then (i) must follow since the product of invertible
matrices is invertible.




3.37. Let A and B be square matrices of the same order. Show that if AB = I, then
B = A^-1. Thus AB = I if and only if BA = I.

Suppose A is not invertible. Then A is not row equivalent to the identity matrix I, and so A
is row equivalent to a matrix with a zero row. In other words, there exist elementary matrices
E_1, ..., E_s such that E_s ··· E_2 E_1 A has a zero row. Hence E_s ··· E_2 E_1 AB has a zero row. Accordingly,
AB is row equivalent to a matrix with a zero row and so is not row equivalent to I. But this con-
tradicts the fact that AB = I. Thus A is invertible. Consequently,

    B  =  IB  =  (A^-1 A)B  =  A^-1(AB)  =  A^-1 I  =  A^-1



3.38. Suppose A is invertible and, say, it is row reducible to the identity matrix I by the
sequence of elementary operations e_1, ..., e_n. (i) Show that this sequence of elemen-
tary row operations applied to I yields A^-1. (ii) Use this result to obtain the inverse

    of A  =  / 1    0   2 \ .
             | 2   -1   3 |
             \ 4    1   8 /

(i) Let E_i be the elementary matrix corresponding to the operation e_i. Then, by hypothesis and
Problem 3.34, E_n ··· E_2 E_1 A = I. Thus (E_n ··· E_2 E_1 I)A = I and hence A^-1 = E_n ··· E_2 E_1 I.
In other words, A^-1 can be obtained from I by applying the elementary row operations e_1, ..., e_n.

(ii) Form the block matrix (A, I) and row reduce it to row canonical form:

    (A, I)  =  / 1    0   2  |  1   0   0 \     to     / 1    0    2  |   1   0   0 \
               | 2   -1   3  |  0   1   0 |            | 0   -1   -1  |  -2   1   0 |
               \ 4    1   8  |  0   0   1 /            \ 0    1    0  |  -4   0   1 /

            to  / 1    0    2  |   1   0   0 \     to     / 1   0   0  |  -11    2    2 \
                | 0   -1   -1  |  -2   1   0 |            | 0   1   0  |   -4    0    1 |
                \ 0    0   -1  |  -6   1   1 /            \ 0   0   1  |    6   -1   -1 /

Observe that the final block matrix is in the form (I, B). Hence A is invertible and B is its
inverse:

    A^-1  =  / -11    2    2 \
             |  -4    0    1 |
             \   6   -1   -1 /

Remark: In case the final block matrix is not of the form (I, B), then the given matrix is not
row equivalent to I and so is not invertible.
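
The computation in (ii) is exactly what the row_canonical_form sketch given after Problem 3.18
performs on the augmented block matrix (A, I); the following added fragment assumes that helper is
available:

    A = [[1, 0, 2], [2, -1, 3], [4, 1, 8]]
    I3 = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
    augmented = [row_a + row_i for row_a, row_i in zip(A, I3)]
    reduced = row_canonical_form(augmented)        # helper sketched after Problem 3.18
    A_inv = [row[3:] for row in reduced]
    print(A_inv)   # [[-11, 2, 2], [-4, 0, 1], [6, -1, -1]]  (as exact Fractions)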




Supplementary Problems 



MATRIX OPERATIONS 

In Problems 3.39-3.41, let 



-(o1 !)• -"i-l 



4 0-3 
2 3 





3.39. Find: (i) A + B, (ii) A + C, (iii) 3A - 4B. 

3.40. Find: (i) AB, (ii) AC, (iii) AD, (iv) BC, (v) BD, (vi) CD.

3.41. Find: (i) A^t, (ii) A^tC, (iii) D^tA^t, (iv) B^tA, (v) D^tD, (vi) DD^t.




61 &2 h h], find (i) ejA, 

C, C2 C3 C4/ 

3.43. Let Cj = (0, . ... 0, 1, 0, .... 0) where 1 is the ith component. Show the following: 
(i) Be*. = Cj, the ith column of B. (By Problem 3.28, ejA = Bj.) 

(ii) If e^A = ejB for each i, then A = B. 
(iii) If Ae\ = Be\ for each i, then A = B. 

ECHELON MATRICES AND ELEMENTARY ROW OPERATIONS 

3.44. Reduce A to echelon form and then to its row canonical form, where 

/l 2-1 2 l\ /2 3-2 5 1^ 

(i) A = 2 4 1-2 3 , (ii) A = 3 -1 2 4 

\3 6 2-6 5/ \4 -5 6 -5 Ij 

3.45. Reduce A to echelon form and then to its row canonical form, where 

/l 3 -1 2\ /O 1 3 -2^ 



(i) A 



11 -5 3 ,..^ ^ 4-13 

2-5 3 1 ■ (") ^=0021 

\4 1 1 5/ \0 5 -3 4, 



3.46. Describe all the possible 2X2 matrices which are in row reduced echelon form. 

3.47. Suppose A is a square row reduced echelon matrix. Show that if A # 7, the identity matrix, then 
A has a zero row. 

3.48. Show that every square echelon matrix is upper triangular, but not vice versa. 

3.49. Show that row equivalence is an equivalence relation: 
(i) A is row equivalent to A; 

(ii) A row equivalent to B implies B row equivalent to A; 

(iii) A row equivalent to B and B row equivalent to C Implies A row equivalent to C. 

SQUARE MATRICES 

3.50. "Let A = ( \ . (i) Find A^ and A3, (ii) If /(«) = vfl - Zx^ - 2x + i, find /(A), (iii) If 
g(x) = a;2 - a; - 8, find g(A). 

3.51. Let B = f ]. (i)U f(x) = 2x2 - 4x + Z, find f{B). (ii) If g{x) = x^ - 4x - 12, find g(B). 
(iii) Find a nonzero column vector u = ( ] such that Bu = 6m. 

3.52. Matrices A and B are said to commute if AB = BA. Find all matrices ( ^ ) which commute 

. , /I 1\ Vz w/ 

with ( 

VO 1 



3.53. Let A = ( ) . Find A". 



<ll) 




3.54. Let A = ( „ „ and B = ( „ ,, 
\0 3/ \0 11 

Find: (i) A + B, (ii) AB, (iii) A^ and A3, (iv) A", (v) /(A) for a polynomial f{x). 

3.56. Suppose the 2-square matrix B commutes with every 2-square matrix A, i.e. AB = BA. Show that 

B = ( , ) for some scalar k, i.e. B is a scalar matrix. 
\0 kj 

3.57. Let Dfc be the m-square scalar matrix with diagonal elements k. Show that: 

(i) for any mXn matrix A, D^A = kA; (ii) for any nXm matrix B, BD^ = kB. 

3.58. Show that the sum, product and scalar multiple of: 
(i) upper triangular matrices is upper triangular; 
(ii) lower triangular matrices is lower triangular; 
(iii) diagonal matrices is diagonal; 

(iv) scalar matrices is scalar. 

INVERTIBLE MATRICES 

- -' '2 -3\ 



3.59. Find the inverse of each matrix: (i) ( ^ c ) > (") ( i 



3/ 



-1 2 -3\ 2 1 -V 

3.60. Find the inverse of each matrix: (i) | 2 1 , (ii) 2 1 

,4-2 5/ \5 2 -3/ 

3.61. Find the inverse of  /  1    3   4 \ .
                           |  3   -1   6 |
                           \ -1    5   1 /

3.62. Show that the operations of inverse and transpose commute; that is, (A«)-i = (A-i)«. Thus, m 
particular, A is invertible if and only if A* is invertible. 

(»! ... 
°. ."! ///. . .M invertible, and what is its inverse? 


3.64. Show that A is row equivalent to B if and only if there exists an invertible matrix P such that 
B = PA. 

3.65. Show that A is invertible if and only if the system AX = has only the zero solution. 

MISCELLANEOUS PROBLEMS 

3.66. Prove Theorem 3.2: (iii) (B + QA = BA + CA; (iv) k(AB) = (kA)B = A(kB), where fc is a scalar. 
(Parts (i) and (ii) were proven in Problem 3.11 and 3.12.) 

3.67. Prove Theorem 3.3: (i) (A + B)* = A* + BH (ii) (A')« = A; (iii) (feA)' = kA*, for k a scalar. 
(Part (iv) was proven in Problem 3.16.) 

3.68. Suppose A = (A^) and B = (B^,) are block matrices for which AB is defined and the number of 
columns of each block Aj^ is equal to the number of rows of each block B^j. Show that AB - (Gy) 
where Cy = 2 Ag^B^j. 




3.69. The following operations are called elementary column operations: 



[El] 

[^3] 



Interchange the tth column and the jth column. 

Multiply the tth column by a nonzero scalar A;. 

Replace the ith column by k times the jth column plus the ith column. 



Show that each of the operations has an inverse operation of the same type. 

3.70. A matrix A is said to be equivalent to a matrix BUB can be obtained from A by a finite sequence 
of operations, each being an elementary row or column operation. Show that matrix equivalence is 
an equivalence relation. 

3.71. Show that two consistent systems of linear equations have the same solution set if and only if their 
augmented matrices are row equivalent. (We assume that zero rows are added so that both aug- 
mented matrices have the same number of rows.) 



Answers to Supplementary Problems 



3.39. 


« i-l -1 1) 


(ii) 


Not defined. (iii) f 


-13 

4 


-3 18 \ 

17 0/ 




3.40. 


(i) Not defined. 




<"« C) 






<^' CI 




/-5 -2 4 
(") ( 11 -3 -12 


«') 


<"> m ": 



8 


1) 


(vi) Not 



1 0\ / 4 -7 4\ / 4 -2 

3.41. (i) I -1 3 I (ii) Not defined. (iii) (9, 9) (iv) 0-6-8 (v) 14 (vi) -2 1-3 

2 4/ \-3 12 6/ \ 6 -3 9y 

3.42. (i) (ai, ag. 03. a*) (ii) (61, 62. K K) (iii) (Ci. Cg, Cg, C4) 

'1 2 4/3^ 

3.44. (i) ( 3-6 1 | and ( 1 

^0 1 -1/6^ 

/2 3-2 5 l\ /l 4/11 5/11 13/11 \ 

(ii) -11 10 -15 5 and 1 -10/11 15/11 -5/11 

\0 0/ \0 / 



3.45. (i) r''-° " and 




/I 


3 


-1 


2 





11 


-5 


3 














^0 








0/ 


/o 


1 


3 


-2 








-13 


11 











35 











0/ 



(ii) L : and 



ll 





4/11 


13/11 





1 


-5/11 


3/11 





























1 





o\ 








1 














1 











0/ 






'■''■ Co o)'(2 D'Co I) ''Co J) -^^'■^ '^ '^ ^"^ «'=^'^" 



^0 1 V 
3.48. ( 1 1 ) Is upper triangular but not an echelon matrix. 
iO 1/ 



3.52. Only matrices of the form ( ] commute with (^ ^J ■ 



3.53. 


/I 2n\ 

^" = (o i) 


3.54. 


/9 

« ^+^ = (o 14 




„ /14 ON 



« - = c :)■ - = c .;) '" '<^' - (T ;,; 



2" 



(»)^^=(o 33; ('^^ ^"=U 3«, 

/3ci 3d,^ 



/ 5 -2\ ,.., / 1/3 1/3 

3.59. (1) (^ 3) (n) f ^/9 2/g 



1-5 4 -3\ / 8 -1 -3^ 



3.60. (i) 10-7 6 



11 



-5 12 



8-6 5/ \ 10 -1 -4y 



/31/2 -17/2 -11^ 
3.61. 9/2 -5/2 -3 

\-7 4 51 



3.62. Given AA^-1 = I. Then I = I^t = (AA^-1)^t = (A^-1)^t A^t. That is, (A^t)^-1 = (A^-1)^t.



/a-i 



.0 a-i ... 
3.63. A is invertible iff each aj 9^ 0. Then A ^ - ' 



\0 



chapter 4 



Vector Spaces and Subspaces 



INTRODUCTION 

In Chapter 1 we studied the concrete structures R^n and C^n and derived various proper-
ties. Now certain of these properties will play the role of axioms as we define abstract
"vector spaces" or, as they are sometimes called, "linear spaces". In particular, the conclu-
sions (i) through (viii) of Theorem 1.1, page 3, become axioms [A1]-[A4], [M1]-[M4] below.
We will see that, in a certain sense, we get nothing new. In fact, we prove in Chapter 5
that every vector space over R which has "finite dimension" (defined there) can be identified
with R^n for some n.

The definition of a vector space involves an arbitrary field (see Appendix B) whose 

elements are called scalars. We adopt the following notation (unless otherwise stated or 

implied): 

K the field of scalars, 

a, &, c or A; the elements of K, 

V the given vector space, 

u, V, w the elements of V. 

We remark that nothing essential is lost if the reader assumes that K is the real field R 
or the complex field C. 

Lastly, we mention that the "dot product", and related notions such as orthogonality, 
is not considered as part of the fundamental vector space structure, but as an additional 
structure which may or may not be introduced. Such spaces shall be investigated in the 
latter part of the text. 

Definition:  Let K be a given field and let V be a nonempty set with rules of addition and
scalar multiplication which assign to any u, v ∈ V a sum u + v ∈ V and to
any u ∈ V, k ∈ K a product ku ∈ V. Then V is called a vector space over K
(and the elements of V are called vectors) if the following axioms hold:

[A1]: For any vectors u, v, w ∈ V, (u + v) + w = u + (v + w).

[A2]: There is a vector in V, denoted by 0 and called the zero vector, for which u + 0 = u
for any vector u ∈ V.

[A3]: For each vector u ∈ V there is a vector in V, denoted by -u, for which u + (-u) = 0.

[A4]: For any vectors u, v ∈ V, u + v = v + u.

[M1]: For any scalar k ∈ K and any vectors u, v ∈ V, k(u + v) = ku + kv.

[M2]: For any scalars a, b ∈ K and any vector u ∈ V, (a + b)u = au + bu.

[M3]: For any scalars a, b ∈ K and any vector u ∈ V, (ab)u = a(bu).

[M4]: For the unit scalar 1 ∈ K, 1u = u for any vector u ∈ V.

63 




The above axioms naturally split into two sets. The first four are only concerned with 
the additive structure of V and can be summarized by saying that 7 is a commutative group 
(see Appendix B) under addition. It follows that any sum of vectors of the form 

Vi + V2 + • • • + Vm 

requires no parentheses and does not depend upon the order of the summands, the zero
vector 0 is unique, the negative -u of u is unique, and the cancellation law holds:

    u + w = v + w     implies     u = v

for any vectors u, v, w ∈ V. Also, subtraction is defined by

    u - v  =  u + (-v)

On the other hand, the remaining four axioms are concerned with the "action" of the 
field K on V. Observe that the labelling of the axioms reflects this splitting. Using these 
additional axioms we prove (Problem 4.1) the following simple properties of a vector space. 

Theorem 4.1:  Let V be a vector space over a field K.

    (i)   For any scalar k ∈ K and 0 ∈ V, k0 = 0.

    (ii)  For 0 ∈ K and any vector u ∈ V, 0u = 0.

    (iii) If ku = 0, where k ∈ K and u ∈ V, then k = 0 or u = 0.

    (iv)  For any scalar k ∈ K and any vector u ∈ V, (-k)u = k(-u) = -ku.

EXAMPLES OF VECTOR SPACES 

We now list a number of important examples of vector spaces. The first example is a 
generalization of the space R". 

Example 4.1:  Let K be an arbitrary field. The set of all n-tuples of elements of K with vector
addition and scalar multiplication defined by

    (a1, a2, ..., an) + (b1, b2, ..., bn)  =  (a1 + b1, a2 + b2, ..., an + bn)

and

    k(a1, a2, ..., an)  =  (ka1, ka2, ..., kan)

where ai, bi, k ∈ K, is a vector space over K; we denote this space by K^n. The zero
vector in K^n is the n-tuple of zeros, 0 = (0, 0, ..., 0). The proof that K^n is a vector
space is identical to the proof of Theorem 1.1, which we may now regard as stating
that R^n with the operations defined there is a vector space over R.

Example 4.2: Let V be the set of all m X n matrices with entries from an arbitrary field K. Then 
y is a vector space over K with respect to the operations of matrix addition and 
scalar multiplication, by Theorem 3.1. 

Example 4.3: Let V be the set of all polynomials Oo + a^t + Ogt^ + • • ■ + a„t" with coefficienis oj 
from a field K. Then y is a vector space over K with respect to the usual operations 
of addition of polynomials and multiplication by a constant. 

Example 4.4: Let K be an arbitrary field and let X be any nonempty set. Consider the set V of all 
functions from X into K. The sum of any two functions f,g eV is the function 

f + gGV defined by 

{f + g){x) = f(x) + g(x) 

and the product of a scalar kEK and a function / e y is the function kfeV 

defined by , „ ^ 

(kf){x) = kf(x) 






Then V with the above operations is a vector space over K (Problem 4.5). The zero 
vector in V is the zero function which maps each x G X into S K: 0{x) = 
for every x G X. Furthermore, for any function f G V, —f is that function in V 
for which (—/)(») = —f(x), for every x G X. 

Example 45: Suppose S is a field which contains a subfield K. Then E can be considered to be a 
vector space over K, taking the usual addition in E to be the vector addition and 
defining the scalar product kv of kGK and v S jF to be the product of k and v 
as element of the field E. Thus the complex field C is a vector space over the real 
field E, and the real field R is a vector space over the rational field Q. 



SUBSPACES 

Let TF be a subset of a vector space over a field K. W is called a subspace of V if TF is 
itself a vector space over K with respect to the operations of vector addition and scalar 
multiplication on V. Simple criteria for identifying subspaces follow. 

Theorem 4.2: W is& subspace of V if and only if 

(i) W is nonempty, 

(ii) W is closed under vector addition: v,w G W implies v + w G W, 

(iii) W is closed under scalar multiplication: v GW implies kv GW for 
every kGK. 

Corollary 4.3: W ia a subspace of V if and only if (i) GW (or W # 0), and (ii) v,w GW 
implies av + bw G W for every a,b GK. 

Example 4.6: Let V be any vector space. Then the set {0} consisting of the zero vector alone, and 
also the entire space V are subspaces of V. 



Example 4.7: (i) 



Let V be the vector space R^. Then the set W consisting of those vectors whose 
third component is zero, W — {{a,b,0) : a,b GR}, is a subspace of V. 



(ii) Let V be the space of all square nX n matrices (see Example 4.2). Then the 
set W consisting of those matrices A = (oy) for which ay = Ojj, called 
symmetric matrices, is a subspace of V. 

(iii) Let V be the space of polynomials (see Example 4.3). Then the set W consisting 
of polynomials with degree — n, for a fixed n, is a subspace of V. 

(iy) Let V be the space of all functions from a nonempty set X into the real field R. 
Then the set W consisting of all bounded functions in V is a subspace of V. 
(A function / € V is bounded if there exists M GR such that |/(a;)| - M for 
every x G X.) 

Example 4.8:  Consider any homogeneous system of linear equations in n unknowns with, say, real
coefficients:

    a11 x1 + a12 x2 + ... + a1n xn  =  0
    a21 x1 + a22 x2 + ... + a2n xn  =  0
    ...........................................

Recall that any particular solution of the system may be viewed as a point in R^n.
The set W of all solutions of the homogeneous system is a subspace of R^n (Problem
4.16) called the solution space. We comment that the solution set of a nonhomo-
geneous system of linear equations in n unknowns is not a subspace of R^n.






Example 4.9: 



Let V and W be subspaces of a vector space V. We show that the intersection 
Vr\W i& also a subspace of V. Clearly G C/ and S W since U and W are sub- 
spaces; whence e UdW. Now suppose m,v e.Ur\W. Then u,v &U and u,v &W 
and, since U and W are subspaces, 



aw + 6i) G ?7 



and 



aw + 6v e W 



for any scalars a,b€K. Accordingly, au + bv & UnW and so [7nTF is a sub- 
space of V. 

The result in the preceding example generalizes as follows. 

Theorem 4.4: The intersection of any number of subspaces of a vector space 7 is a 
subspace of V. 



LINEAR COMBINATIONS, LINEAR SPANS 

Let F be a vector space over a field K and let vi, ...,VmGV. Any vector in V of the 

form 

aiVi + a2V2 4- • • • + amVm 

where the OiGK, is called a linear combination of vi,...,Vm. The following theorem 
applies. 

Theorem 4.5: Let S be a nonempty subset of V. The set of all linear combinations of 
vectors in S, denoted by L{S), is a subspace of V containing S. Further- 
more, if W is any other subspace of V containing S, then L{S) CW. 

In other words, L{S) is the smallest subspace of V containing S; hence it is called the 
subspace spanned or generated by S. For convenience, we define L{0) = {0}. 

Example 4.10: Let V be the vector space R3. The linear span of any nonzero vector u consists 
of all scalar multiples of u; geometrically, it is the line through the origin and the 
point u. The linear space of any two vectors u and v which are not multiples of 
each other is the plane through the origin and the points u and v. 





Example 4.11:  The vectors e1 = (1, 0, 0), e2 = (0, 1, 0) and e3 = (0, 0, 1) generate the vector space
R^3. For any vector (a, b, c) ∈ R^3 is a linear combination of the e_i; specifically,

    (a, b, c)  =  a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1)
               =  ae1 + be2 + ce3

Example 4.12: The polynomials 1, t, t^, t^, ... generate the vector space V of all polynomials 
(in*): y = L(l, t, t^ . . .). For any polynomial is a linear combination of 1 and 
powers of t. 




Example 4.13:  Determine whether or not the vector v = (3, 9, -4, -2) is a linear combination of
the vectors u1 = (1, -2, 0, 3), u2 = (2, 3, 0, -1) and u3 = (2, -1, 2, 1), i.e. belongs
to the space spanned by the u_i.

Set v as a linear combination of the u_i using unknowns x, y and z; that is, set
v = xu1 + yu2 + zu3:

    (3, 9, -4, -2)  =  x(1, -2, 0, 3) + y(2, 3, 0, -1) + z(2, -1, 2, 1)
                    =  (x + 2y + 2z,  -2x + 3y - z,  2z,  3x - y + z)

Form the equivalent system of equations by setting corresponding components equal
to each other, and then reduce to echelon form:

     x + 2y + 2z =  3         x + 2y + 2z =  3         x + 2y + 2z =  3
    -2x + 3y -  z =  9            7y + 3z = 15             7y + 3z = 15
                        or                       or
              2z = -4                  2z = -4                  2z = -4
     3x -  y +  z = -2           -7y - 5z = -11                -2z =  4

                              x + 2y + 2z =  3
                       or         7y + 3z = 15
                                       2z = -4

Note that the above system is consistent and so has a solution; hence v is a linear
combination of the u_i. Solving for the unknowns we obtain x = 1, y = 3, z = -2.
Thus v = u1 + 3u2 - 2u3.

Note that if the system of linear equations were not consistent, i.e. had no solu-
tion, then the vector v would not be a linear combination of the u_i.
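
The test above is a consistency question about a linear system whose columns are the u_i. The added
sketch below makes the test mechanical; it assumes the row_canonical_form helper sketched after
Problem 3.18 in Chapter 3 is available:

    u1, u2, u3 = (1, -2, 0, 3), (2, 3, 0, -1), (2, -1, 2, 1)
    v = (3, 9, -4, -2)

    # columns are u1, u2, u3; the last column is v
    augmented = [[u1[i], u2[i], u3[i], v[i]] for i in range(4)]
    reduced = row_canonical_form(augmented)

    # the system is inconsistent exactly when some row reads (0, 0, 0, nonzero)
    consistent = all(any(x != 0 for x in row[:-1]) or row[-1] == 0 for row in reduced)
    print(consistent)      # True:  v = 1*u1 + 3*u2 - 2*u3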

ROW SPACE OF A MATRIX 

Let A be an arbitrary mxn matrix over a field K: 

(Hi ... ai„ \ 

(I22 . . . a,2n 



\fflml flm2 . . . dmn/ 

The rows of A, 

Rl = (ttll, 0,21, ■ . ., am), . . . , Rm = (Oml, am2, . . . , dmn) 

viewed as vectors in .K", span a subspace of K" called the row space of A. That is, 

row space of A — L{Ri, R2, . . . , Rm) 

Analogously, the columns of A, viewed as vectors in K"", span a subspace of X" called the 
column space of A. 

Now suppose we apply an elementary row operation on A, 

(i) Ri <^ Rj, (ii) Ri ^ kRi, k¥'0, or (iii) Ri -> kRj + Ri 

and obtain a matrix B. Then each row of B is clearly a row of A or a linear combination of 
rows of A. Hence the row space of B is contained in the row space of A. On the other 
hand, we can apply the inverse elementary row operation on B and obtain A; hence the row 
space of A is contained in the row space of B. Accordingly, A and B have the same row 
space. This leads us to the following theorem. 

Theorem 4.6: Row equivalent matrices have the same row space. 

We shall prove (Problem 4.31), in particular, the following fundamental result con- 
cerning row reduced echelon matrices. 




Theorem 4.7: Row reduced echelon matrices have the same row space if and only if they 
have the same nonzero rows. 

Thus every matrix is row equivalent to a unique row reduced echelon matrix called its 
row canonical form. 

We apply the above results in the next example. 

Example 4.14: Show that the space U generated by the vectors 

Ml = (1, 2, -1, 3), M2 = (2, 4, 1, -2), and wg = (3, 6, 3, -7) 
and the space V generated by the vectors 

vi = (1, 2, -4, 11) and v^ = (2, 4, -5, 14) 

are equal; that is, U = V. 

Method 1. Show that each Mj is a linear combination of v^ and V2, and show that 
each Vi is a linear combination of Mj, M2 and M3. Observe that we have to show that 
six systems of linear equations are consistent. 

Method 2.  Form the matrix A whose rows are the u_i, and row reduce A to row
canonical form:

    A  =  / 1   2   -1     3 \   to   / 1   2   -1     3 \   to   / 1   2   -1    3 \   to   / 1   2   0     1/3 \
          | 2   4    1    -2 |        | 0   0    3    -8 |        | 0   0    3   -8 |        | 0   0   1    -8/3 |
          \ 3   6    3    -7 /        \ 0   0    6   -16 /        \ 0   0    0    0 /        \ 0   0   0       0 /

Now form the matrix B whose rows are v1 and v2, and row reduce B to row canonical
form:

    B  =  / 1   2   -4   11 \   to   / 1   2   -4   11 \   to   / 1   2   0     1/3 \
          \ 2   4   -5   14 /        \ 0   0    3   -8 /        \ 0   0   1    -8/3 /

Since the nonzero rows of the reduced matrices are identical, the row spaces of A
and B are equal and so U = V.
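
Method 2 can be stated as an algorithm: two matrices generate the same row space exactly when
their row canonical forms have the same nonzero rows (Theorem 4.7). An added sketch, again
assuming the row_canonical_form helper from Chapter 3:

    def nonzero_rows_of_rcf(M):
        return [tuple(row) for row in row_canonical_form(M) if any(x != 0 for x in row)]

    A = [[1, 2, -1, 3], [2, 4, 1, -2], [3, 6, 3, -7]]
    B = [[1, 2, -4, 11], [2, 4, -5, 14]]
    print(nonzero_rows_of_rcf(A) == nonzero_rows_of_rcf(B))   # True, so U = V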

SUMS AND DIRECT SUMS 

Let U and W be subspaces of a vector space V. The sum of U and W, written U + W, 
consists of all sums u + w where uGU and w &W: 

U + W = {u + w:uGU,wGW} 

Note that = + eU + W, since OeU.OGW. Furthermore, suppose u + w and 
u' + w' belong \joU + W, with u,u' GU and w,w' e W. Then 

(u + w) + (u' + w') = {u + u') + {w + w') G U + W 

and, for any scalar k, k{u + w) = ku + kw G U + W 

Thus we have proven the following theorem. 

Theorem 4.8: The sum U + W of the subspaces U and TF of F is also a subspace of V. 

Example 4.15:  Let V be the vector space of 2 by 2 matrices over R. Let U consist of those
matrices in V whose second row is zero, and let W consist of those matrices in V
whose second column is zero:

    U  =  { / a   b \  :  a, b ∈ R }          W  =  { / a   0 \  :  a, c ∈ R }
            \ 0   0 /                                 \ c   0 /

Now U and W are subspaces of V. We have:

    U + W  =  { / a   b \  :  a, b, c ∈ R }          U ∩ W  =  { / a   0 \  :  a ∈ R }
                \ c   0 /                                        \ 0   0 /

That is, U + W consists of those matrices whose lower right entry is 0, and U ∩ W
consists of those matrices whose second row and second column are zero.

Definition:  The vector space V is said to be the direct sum of its subspaces U and W,
denoted by

    V  =  U ⊕ W

if every vector v ∈ V can be written in one and only one way as v = u + w
where u ∈ U and w ∈ W.

The following theorem applies. 

Theorem 4.9: The vector space V is the direct sum of its subspaces U and W if and only 
if: {i)V ^ U+W, and (ii) UnW = {0}. 

Example 4.16: In the vector space R^, let U be the xy plane and let W be the yz plane: 

U = {{a, 6, 0) : a, 6 S R} and W = {(0, b,c): h,c& R} 

Then R^ = U+W since every vector in R3 is the sum of a vector in U and a vector 
in W. However, R* is not the direct sum of U and W since such sums are not 
unique; for example, 

(3, 5, 7) = (3, 1, 0) + (0, 4, 7) and also (3, 5, 7) = (3, -4, 0) + (0, 9, 7) 

Example 4.17:  In R^3, let U be the xy plane and let W be the z axis:

    U  =  {(a, b, 0) : a, b ∈ R}     and     W  =  {(0, 0, c) : c ∈ R}

Now any vector (a, b, c) ∈ R^3 can be written as the sum of a vector in U and a
vector in W in one and only one way:

    (a, b, c)  =  (a, b, 0) + (0, 0, c)

Accordingly, R^3 is the direct sum of U and W, that is, R^3 = U ⊕ W.
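
The uniqueness of the decomposition in Example 4.17 amounts to the obvious coordinate split, which
the following added one-liner spells out:

    def split_xy_z(v):
        # the unique decomposition of (a, b, c) into an xy-plane part and a z-axis part
        a, b, c = v
        return (a, b, 0), (0, 0, c)

    u, w = split_xy_z((3, 5, 7))
    print(u, w)                      # (3, 5, 0) (0, 0, 7)
    assert tuple(x + y for x, y in zip(u, w)) == (3, 5, 7)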



Solved Problems 

VECTOR SPACES 

4.1. Prove Theorem 4.1: Let F be a vector space over a field K. 

(i) For any scalar kGK and GV, fcO = 0. 

(ii) For GK and any vector uGV, Ou = 0. 

(iii) If ku — 0, where kGK and uGV, then fc = or u = 0. 

(iv) For any kGK and any uGV, {-k)u = k{-u) = - ku. 

(i) By axiom [A^] with m = 0, we have + = 0. Hence by axiom [Mi], fcO = fc(0 + 0) = 
fee + fcO. Adding — kO to both sides gives the desired result. 

(ii) By a property of K, + = 0. Hence by axiom [Mg], Om = (0 + 0)m = Qu + Ou. Adding - Om 
to both sides yields the required result. 




(iii) Suppose fcw = and k ¥= 0. Then there exists a scalar fc^i such that fc~ifc = 1; hence 

u = lu = {k-^k)u = k-Hku) = fe-iQ = 

(iv) Using u + {-u) = 0, we obtain = kO = k{u + (-m)) = few + k{-u). Adding -ku to both 
sides gives —ku — k(—u). 

Using k + (-fe) = 0, we obtain = Oit = (fe + (-k))u = ku + (-k)u. Adding -ku to both 
sides yields —ku = {—k)u. Thus (—k)u = k(—u) = —ku. 

4.2. Show that for any scalar k and any vectors u and v, k{u-v) = ku- kv. 

Using the definition of subtraction {u-v = u+ (-■ v)) and the result of Theorem 4.1(iv) 
{k(—v) = —kv), 

k(u -v) = k(u + (-v)) = ku + k(-v) = ku + (-kv) = ku - kv 

4.3. In the statement of axiom [Mz], (a + b)u = au + bu, which operation does each plus 
sign represent? 

The + in (a+b)u denotes the addition of the two scalars a and 6; hence it represents the addi- 
tion operation in the field K. On the other hand, the + in au+ bu denotes the addition of the two 
vectors au and bu; hence it represents the operation of vector addition. Thus each + represents a 
different operation. 

4.4. In the statement of axiom [Ma], (ab)u = a{bu), which operation does each product 

represent? 

In (ab)u the product ab of the scalars a and 6 denotes multiplication in the field K, whereas the 
product of the scalar ab and the vector u denotes scalar multiplication. 

In a{bu) the product bu of the scalar 6 and the vector u denotes scalar multiplication; also, the 
product of the scalar a and the vector bu denotes scalar multiplication. 

4.5. Let V be the set of all functions from a nonempty set X into a field K. For any func- 
tions f.gGV and any scalar k G K, let f + g and kf be the functions in V defined 

as follows: 

{f + 9){x) - fix) + g{x) and (kf){x) = kf(x), yfx G X 

(The symbol V means "for every".) Prove that 7 is a vector space over K. 

Since X is nonempty, V is also nonempty. We now need to show that all the axioms of a vector 
space hold. 

[All- Let f.g.hGV. To show that (f + g) + h = f + (g + h), it is necessary to show that 
the function (f + g) + h and the function f + (g + h) both assign the same value to each 
a; e X. Now, 

((f + g) + h)(x) = if + g){x) + h{x) = (fix) + g{x)) + h(x), Vo; G X 

(f+(g + h))(x) = f(x) + (g + h)(x) = f(x) + (g(x) + h(x)), yfxGX 

But f(x), g(x) and h(x) are scalars in the field K where addition of scalars is associative; hence 

(f(x) + g(x)) + h(x) = f(x) + (g(x) + h(x)) 

Accordingly, (f + g) + h = f +(g + h). 

[AJ: Let denote the zero function: 0(a;) = 0, Va; G X. Then for any function / G V, 
(/ + 0)(a;) = f(x) + 0(a!) = f(x) + = f(x), Vo; G X 
Thus / + = /, and is the zero vector in V. 




[A3]: For any function / G V, let -/ be the function defined by (-/)(«) = - f(x). Then, 

(/ + (-/))(«) = f(x) + (-/)(*) = f(x) - f(x) = = Oix), yfx&X 

Hence / + (-/) = 0. 

[AJ: Let f.g^V. Then 

(/ + ffKx) = f(x) + gix) = g(x) + f(x) = (g + f)(x), y/x&X 

Hence f + g = g + f. (Note that /(*) + g(x) = g(x) + f{x) follows from the fact that /(«) and 
g(x) are scalars in the field K where addition is commutative.) 

[Ml]: Let f,g&V and k & K. Then 

W + 9))i.x) = k((f + g)(x)) = k(f(x) + g(x)) = kf(x) + kg(x) 
= (kf)(x) + (kg)(x) = (kf + kg)(x), ^fxeX 

Hence k(f + g) = kf + kg. (Note that k(f(x) + g{x)) = kf(x) + kg(x) follows from the fact that 
k, f(x) and g(x) are scalars in the field K where multiplication is distributive over addition.) 

[M2]: Let /ey and o, 6 6 X. Then 

((a+6)/)(a;) = (a+h)f(x) = af(x) + hfi^x) = (af)(x) + 6/(a;) 
= (af+hf)(x), VaseX 

Hence (a + 6)/ = af + bf. 

[Mg]: Let f&V and a, 6 G X. Then, 

({ab)f)(x) = (a6)/(x) = a(6/(a;)) = o(6/)(a;) = (a(6/))(a;), Va; G ;f 
Hence (ab)f = a(6/). 

[AfJ: Let / G y. Then, for the unit leK, (!/)(») = l/(a;) = f{x), V« G X. Hence 1/ = /. 
Since all the axioms are satisfied, y is a vector space over K. 

4.6. Let V be the set of ordered pairs of real numbers: V = {{a,b): a,bGR}. Show 
that V is not a vector space over R with respect to each of the following operations 
of addition in V and scalar multiplication on V: 

(i) (a, b) + (c, d) = (a + c,b + d) and k{a, b) = {ka, b); 

(ii) (a, 6) + (c, d) = (a, 6) and k{a, b) = (fee, kb); 

(iii) (a, 6) + (c, d) = (o + c, 6 + d) and fe(a, 6) := (fc^a, fe^ft). 

In each case show that one of the axioms of a vector space does not hold. 
(i) Let r = l, 8 = 2, v = (3, 4). Then 

(r + s)v = 3(3,4) = (9,4) 
rv + sv ^ 1(3, 4) + 2(3, 4) = (3, 4) + (6, 4) = (9, 8) 
Since (r + s)v ¥= rv + sv, axiom [M^] does not hold. 

(ii) Let 0) = (1,2), w = (3,4). Then 

v + w - (1, 2) + (3, 4) = (1, 2) 

w + v = (3, 4) + (1,2) = (3,4) 
Since v + w ¥= w + v, axiom [AJ does not hold. 

(iii) Let r = 1, s = 2, i; = (3, 4). Then 

(r + s)v = 3(3, 4) = (27, 36) 
rv + SV = 1(3, 4) + 2(3, 4) = (3, 4) + (12, 16) = (15, 20) 
Thus {r + s)v ¥= rv + sv, and so axiom [M2] does not hold. 






SUBSPACES 

4.7. Prove Theorem 4.2: Wis a. subspace of V if and only if (i) W is nonempty, (ii) v,w eW 
implies v + wGW, and (iii) v GW implies kv GW for every scalar kGK. 

Suppose W satisfies (i), (ii) and (iii). By (i), W is nonempty; and by (ii) and (iii), the operations 
of vector addition and scalar multiplication are well defined for W. Moreover, the axioms [A,], [AJ, 
[Ml], [Ma], [Mg] and [MJ hold in W since the vectors in W belong to V. Hence we need only show 
that [A2] and [A3] also hold in W. By (i), W is nonempty, say uGW. Then by (iii), Ou - S W 
and v + = v for every v G W. Hence W satisfies [Ag]. Lastly, it v G W then (-l)v = -v £ W 
and V + (-v) = 0; hence W satisfies [A3]. Thus W is a subspace of V. 

Conversely, if TF is a subspace of V then clearly (i), (ii) and (iii) hold. 

4.8. Prove Corollary 4.3: W ia a subspace of V if and only if (i) e TF and (ii) v,wGW 
implies av + bw GW for all scalars a,b GK. 

Suppose W satisfies (i) and (ii). Then, by (i), W is nonempty. Furthermore, if v,w G W then, 
by (ii), v + w = lv + lweW; and if v&W and kGK then, by (ii), kv = kv + Ove W. Thus 
by Theorem 4.2, Tf^ is a subspace of V. 

Conversely, if W is a subspace of V then clearly (i) and (ii) hold in W. 

4.9. Let y = R^ Show that W is a subspace of V where: 

(i) w = {(a, b,0): a,b G R}, i.e. W is the xy plane consisting of those vectors whose 

third component is 0; 
(ii) W = {{a,b,c): a + b + c = 0}, i.e. W consists of those vectors each with the 

property that the sum of its components is zero. 

(i) = (0, 0,0) e W since the third component of is 0. For any vectors 1; = (a, 6, 0), w = 
(c, d, 0) in W, and any scalars (real numbers) k and k', 
kv + k'w = k(a, b, 0) + k'(c, d, 0) 

= (ka, kb, 0) + (fc'c, k'd, 0) = (ka + k'c, kb + k'd, 0) 

Thus kv + k'w e W, and so W is a subspace of V. 

(ii) = (0, 0,0) GW since + + = 0. Suppose v = (a, b, c), w = (a', b', e') belong to W, i.e. 
tt + 6 + c = and a' + 6' + C = 0. Then for any scalars k and k', 
kv + k'w = k(a, b, c) + k'(a', b', c') 

= (ka, kb, kc) + {k'a', k'b', k'c') 
= (ka + k'a', kb + k'b', kc + k'c') 
and furthermore, 

(ka + k'a') + (kb + k'b') + (kc + k'c') = k(a+ b + c) + k'{a' + b' + e') 

= fcO + fc'O = 
Thus kv + k'w e W, and so W is a subspace of V. 

4.10. Let V = R^ Show that W is not a subspace of V where: 

(i) PF = {{a, b,c): a ^ 0}, i.e. W consists of those vectors whose first component is 

nonnegative; 
(ii) Pf = {(a, b, c): d' + b^ + c^^ 1}, i.e. W consists of those vectors whose length does 

not exceed 1; 
(iii) W = {(a, 6, c) : a, b, c e Q}, i.e. W consists of those vectors whose components are 

rational numbers. 

In each case, show that one of the properties of, say. Theorem 4.2 does not hold. 
(i) ,, = (1,2,3) GW and fc = -5 e R. But fc. = -5(1,2,3) = (-5, -10,-15) does not belong to 

W since -5 is negative. Hence W is not a subspace of V. 




(ii) V = (1, 0,0) eW and w = (0, 1, 0)eW. But v + w = (1, 0, 0) + (0, 1, 0) = (1, 1, 0) does not 
belong to W since 1^ + 1^ + 0^ = 2 > 1. Hence W Is not a subspace of V. 

(iii) v = (1,2,3) GW and k = y/2GK. But fcr = \/2 (1,2,3) = (\/2, 2V2, 3\/2) does not belong to 
W since its components are not rational numbers. Hence W is not a subspace of V. 

4.11. Let V be the vector space of all square nxn matrices over a field K. Show that W 
is a subspace of V where: 

(i) W consists of the symmetric matrices, i.e. all matrices A = (otj) for which 

(ii) W consists of all matrices which commute with a given matrix T; that is, 
W= {AgV: AT = TA}. 

(i) OSW since all entries of are and hence equal. Now suppose A = (ay) and B = (6y) 
belong to W, i.e. ftjj = ay and 5jj = 6y. For any scalars a, 6 G if , aA + bB is the matrix 
whose ii-entry is aa^ + 66y. But aa^j + 66ji = aoy + 66y. Thus aA + 6B is also symmetric, 
and so TF is a subspace of V. 

(ii) OeW since or = = TO. Now suppose A,BgW; that is, AT - TA and BT = TB. For 
any scalars a,b G K, 

{aA + bB)T = (aA)T + {bB)T = a(AT) + b(BT) = a{TA) + b(TB) 

= T(aA) + T{hB) = r(aA + 5B) 

Thus aA + 6B commutes with T, i.e. belongs to W; hence W is a subspace of V. 



4.12. Let V be the vector space of all 2 x 2 matrices over the real field R. Show that W 
is not a subspace of V where: 

(i) W consists of all matrices with zero determinant; 
(ii) W consists of all matrices A for which A^ = A. 

(i) (Recall that det( ^ = ad — be.) The matrices A = f ) and B = f j belong 

to W since det(A) = and det(B) = 0. But A + B = f j does not belong to W since 

det (A + B) = 1. Hence W is not a subspace of V. 



<ll) 



(ii) The unit matrix i = ( » , ) belongs to W since 



- = i'l X n = (' ") = ^ 



/1 

Vo 1 



/2 0\ 
But 2/ = ( I does not belong to W since 

Hence W is not a subspace of V. 



4 
4 



¥^ 2/ 



4.13. Let V be the vector space of all functions from the real field R into R. Show that W 
is a subspace of V where: 
(i) w = {f: /(3) = 0}, i.e. W consists of those functions which map 3 into 0; 

(ii) W = {f: /(7) = /(!)}, i.e. W consists of those functions which assign the same 

value to 7 and 1; 
(iii) W consists of the odd functions, i.e. those functions / for which /(-«) = - /(«)• 




Here denotes the zero function: 0(a;) = 0, for every x & R. 

(i) OeW since 0(3) = 0. Suppose f.gGW, i.e. /(3) = and sr(3) = 0. Then for any real 

numbers a and b, 

(af + bg)(3) = (1/(3) + bg{3) = aO + 60 = 

Hence af + bg & W, and so TF is a subspace of V. 

(11) OeW since 0(7) = = 0(1). Suppose f.g^W, I.e. /(7) = /(I) and «r(7) = flr(l). Then, for 
any real numbers a and b, 

(af+bg){7) = af(7) + bg(l) = a/(l) + 6ff(l) = (a/+6fl-)(l) 

Hence af + bg & W, and so W is a subspace of V. 

(ill) OeW since 0(-a;) = = -0 = -0(a;). Suppose f,g&W, i.e. /(-x) = -/(«) and g{-x) = 
— g{x). Then for any real numbers a and b, 

(a/+6f?)(-a!) = a/(-a;) + 6flr(-a;) = - af(x) - bg{x) = - (a/(x) + 6flr(a;)) = -(af + bg)(x) 
Hence af + bg €: W, and so W is a subspace of V. 

4.14. Let V be the vector space of all functions from the real field R into R. Show that W 
is not a subspace of V where: 

(i) W={f: /(7) = 2 + /(!)}; 

(ii) W consists of all nonnegative functions, i.e. all function / for which f{x) ^ 0, 

yfxGR. 

(i) Suppose f.geW, I.e. /(7) = 2 + /(l) and flr(7) = 2 + flr(l). Then 
(f + g)i^) = /(7) + fl-C?) = 2 + /(I) + 2 + flr(l) 

= 4 + /(I) + ir(l) = 4 + (/ + flr)(l) ^ 2 + (/ + sf)(l) 
Hence f + g^W, and so Tl' is not a subspace of V. 
(11) Let fc = -2 and let / G V be defined by /(a) = x^. Then / G W since /(«) = a;2 s= Q, 
V« e R. But (fc/)(5) = fe/(5) = (-2)(52) = -50 < 0. Hence kf € W, and so W Is not a sub- 
space of V. 

4.15. Let V be the vector space of polynomials ao + ait + a^t^ + • • • + a„f" with real coef- 
ficients, i.e. Oi e R. Determine whether or not VF is a subspace of V where: 

(i) W consists of all polynomials with integral coefficients; 
(ii) W consists of all polynomials with degree — 3; 

(iii) W consists of all polynomials &o + &it^ + h^t^ + • • • + bnt^", i.e. polynomials with 
only even powers of t. 

(1) No, since scalar multiples of vectors in W do not always belong to W. For example, v = 
3 + 5t + 7(2 e ^ but ^i' = f + |« + 1*^ ^ W. (Observe that W is "closed" under vector 
addition, i.e. sums of elements in W belong to W.) 

(ii) and (iii). Yes. For, in each case, W is nonempty, the sum of elements in W belong to W, and 
the scalar multiples of any element in W belong to W. 

4.16. Consider a homogeneous system of linear equations in n unknowns Xi, . ..,Xn over a 

field K: , ^ n 

anXi + ai2X2 + • ■ ■ + ainXn = 

aziXl + 022*2 + • ■ • + a2nX„ = 



OmlXl + am2X2 + • • • + dmnXn = 

Show that the solution set W is a subspace of the vector space K". 
= (0, 0, . . . , 0) e PF since, clearly, 

a«0 + ajjO + • • • + ttinO = 0, for t = 1, . . .,m 




Suppose M = («!, M2, • • . , M„) and v = (vj, Vg, . . . , v„) belong to W, i.e. for i — I, . . .,Tn 

«il"l + ai2W2 + • ■ • + ttinMn = 

OjiVi + ai2'y2 + • • • + ai„v„ = 

Let a and 6 be scalars in K. Then 

au + bv = (ciMx + 6^1, au2 + 6^2. • ■ • > <*w„ + 6v„) 
and, for i = 1, . . . , m, 

aji(aMi + bvj) + ai2(au2 + 6f 2) + • " • + ain(aMn + bv^) 

= o(ajiMi + OJ2M2 + • • • + «!„«*„) + 6(ajii;i + (ij2i;2 + • • • + aini'n) 

= aO + 60 = 

Hence au + 6v is a solution of the system, i.e. belongs to W. Accordingly, W is a subspace of K". 

LINEAR COMBINATIONS 

4.17. Write the vector v = (1, —2, 5) as a linear combination of the vectors ei = (1, 1, 1), 
62 = (1,2, 3) and 63 = (2, -1,1). 

We wish to express v as v = xei + 3/62 + ze^, with x, y and z as yet unknown scalars. Thus 

we require 

(1, -2, 5) = x{l, 1, 1) + j/(l, 2, 3) + z(2, -1, 1) 

= (x, X, x) + (y, 2y, 3y) + (22, -z, z) 

= (a; + 2/ + 2z, a; + 2j/ — z, a; + 32/ + 2) 

Form the equivalent system of equations by setting corresponding components equal to each other, 
and then reduce to echelon form: 

    x +  y + 2z =  1         x + y + 2z =  1         x + y + 2z =  1
    x + 2y -  z = -2   or        y - 3z = -3   or        y - 3z = -3
    x + 3y +  z =  5            2y -  z =  4                 5z = 10

Note that the above system is consistent and so has a solution. Solve for the unknowns to obtain
x = -6, y = 3, z = 2. Hence v = -6e1 + 3e2 + 2e3.

4.18. Write the vector v = (2, -5, 3) in R^ as a linear combination of the vectors Ci = 
(1,-3,2), 62 = (2, -4,-1) and 63 = (1,-5, 7). 

Set V as a linear combination of the Cj using the unknowns x, y and z: v = xe^ + j/eg + zeg. 

(2, -5, 3) = x{\, -3, 2) + y{2, -4, -1) + z(l, -5, 7) 

= {x + 2y + z, -3x -4y-5z,2x-y + 7z) 

Form the equivalent system of equations and reduce to echelon form: 

x + 2y+z=2 x + 2y+z-2 x + 2y + z = 2 

-3x - 4y - 5z = -5 or 2y - 2z = 1 or 2^ - 2z = 1 

2x — y + Tz = 3 -5y + 5z = -1 = 3 

The system is inconsistent and so has no solution. Accordingly, v cannot be written as a linear com- 
bination of the vectors Ci, e^ and 63. 

4.19. For which value of k will the vector u = (1, -2, k) in R" be a linear combination of 
the vectors v = (3, 0, -2) and w = (2, -1, -5) ? 

Set u = XV + yw: 

(1, -2, fe) = a(3, 0, -2) + j/(2, -1, -5) = (3a; + 2y, -y, -2x - 5y) 

Form the equivalent system of equations: 

3x + 2y = 1, -y = -2, -2x - 5y = k 
By the first two equations, x = —1, j/ = 2. Substitute into the last equation to obtain k = —8. 




4^0. Write the polynomial v = t^ + 4t — 3 over R as a linear combination of the poly- 
nomials ei = t^-2t + 5, 62 = 2t^ - St and ca = t + S. 

Set D as a linear combination of the ej using the unknowns x, y and z: v = xe^ + ye^ + 263. 
t2 + 4t - 3 = a;(t2-2t + 5) + 3/(2«2-3t) + 2(f + 3) 

= a;t2 - 2xt + 5a; + 22/t2 - s^/t + zt + 3z 
- {x + 2y)fi + {-2x-3y + z)t + (5a; + 3z) 
Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form: 
x + 2y = 1 X + 2y = 1 x + 2y = 1 

—2x — 3j/+z=4 or 2/+z=6 or 2/ + z=6 

hx + 3z = -3 -IQy + 3z = -8 13z = 52 

Note that the system is consistent and so has a solution. Solve for the unknowns to obtain
x = -3, y = 2, z = 4. Thus v = -3e1 + 2e2 + 4e3.



4.21. Write the matrix E  =  / 3    1 \   as a linear combination of the matrices
                             \ 1   -1 /

    A  =  / 1   1 \ ,     B  =  / 0   0 \ ,     C  =  / 0    2 \ .
          \ 1   0 /             \ 1   1 /             \ 0   -1 /

Set E as a linear combination of A, B, C using the unknowns x, y, z:  E = xA + yB + zC.

    / 3    1 \   =   x / 1   1 \  +  y / 0   0 \  +  z / 0    2 \
    \ 1   -1 /         \ 1   0 /       \ 1   1 /       \ 0   -1 /

                 =   / x   x \  +  / 0   0 \  +  / 0   2z \   =   /   x        x + 2z \
                     \ x   0 /      \ y   y /     \ 0   -z /      \ x + y      y -  z /

Form the equivalent system of equations by setting corresponding entries equal to each other:

    x = 3,     x + y = 1,     x + 2z = 1,     y - z = -1

Substitute x = 3 in the second and third equations to obtain y = -2 and z = -1. Since these
values also satisfy the last equation, they form a solution of the system. Hence E = 3A - 2B - C.



4.22. Suppose u is a linear combination of the vectors v1, ..., vm and suppose each vi is a
linear combination of the vectors w1, ..., wn:

    u = a1 v1 + a2 v2 + ... + am vm     and     vi = b_i1 w1 + b_i2 w2 + ... + b_in wn

Show that u is also a linear combination of the w_j. Thus if S ⊂ L(T), then L(S) ⊂ L(T).

    u  =  a1 v1 + a2 v2 + ... + am vm
       =  a1(b_11 w1 + ... + b_1n wn) + a2(b_21 w1 + ... + b_2n wn) + ... + am(b_m1 w1 + ... + b_mn wn)
       =  (a1 b_11 + a2 b_21 + ... + am b_m1) w1 + ... + (a1 b_1n + a2 b_2n + ... + am b_mn) wn

or simply

    u  =  Σ_i a_i v_i  =  Σ_i a_i ( Σ_j b_ij w_j )  =  Σ_j ( Σ_i a_i b_ij ) w_j



LINEAR SPANS, GENERATORS 

4.23. Show that the vectors u = (1, 2, 3), v = (0, 1, 2) and w = (0, 0, 1) generate W. 

We need to show that an arbitrary vector (a, 6, c) S R3 is a linear combination of u, v and w. 

Set (a, 5, c) = xu + yv + zw: 

(a, b, c) = x(l, 2, 3) + 1/(0, 1, 2) + z(0, 0, 1) = {x, 2x + y,Sx + 2y + z) 




Then form the system of equations

x            = a
2x + y       = b
3x + 2y + z  = c

The above system is triangular and hence consistent; in fact x = a, y = b - 2a, z = c - 2b + a
is a solution. Thus u, v and w generate R³.
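Equivalently, u, v and w generate R³ exactly when the matrix having them as rows has rank 3. A quick numerical check, assuming Python with numpy:

```python
import numpy as np

M = np.array([[1, 2, 3],
              [0, 1, 2],
              [0, 0, 1]])   # rows are u, v, w

# rank 3 means the three rows span all of R^3
print(np.linalg.matrix_rank(M))   # expected: 3
```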

4.24. Find conditions on a, b and c so that (a, b, c) ∈ R³ belongs to the space generated by
u = (2, 1, 0), v = (1, -1, 2) and w = (0, 3, -4).

Set (a, b, c) as a linear combination of u, v and w using unknowns x, y and z: (a, b, c) =
xu + yv + zw.

(a, b, c) = x(2, 1, 0) + y(1, -1, 2) + z(0, 3, -4) = (2x + y, x - y + 3z, 2y - 4z)

Form the equivalent system of linear equations and reduce it to echelon form:

2x + y      = a          2x + y      = a              2x + y      = a
 x - y + 3z = b    or       3y - 6z = a - 2b   or        3y - 6z = a - 2b
     2y - 4z = c            2y - 4z = c                        0 = 2a - 4b - 3c

The vector (a, b, c) belongs to the space generated by u, v and w if and only if the above system is
consistent, and it is consistent if and only if 2a - 4b - 3c = 0. Note, in particular, that u, v and
w do not generate the whole space R³.
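Since w = u - 2v, the three vectors span only a plane, and a different but equivalent way to obtain the condition is to ask when (a, b, c) is dependent on u and v, i.e. when a 3 × 3 determinant vanishes. A sketch assuming Python with sympy:

```python
from sympy import symbols, Matrix, expand

a, b, c = symbols('a b c')
# (a, b, c) lies in the plane spanned by u and v exactly when this determinant is zero
condition = Matrix([[2, 1, 0],
                    [1, -1, 2],
                    [a, b, c]]).det()
print(expand(condition))   # expected: 2*a - 4*b - 3*c
```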

4.25. Show that the xy plane W = {(a, b, 0)} in R³ is generated by u and v where: (i) u =
(1, 2, 0) and v = (0, 1, 0); (ii) u = (2, -1, 0) and v = (1, 3, 0).

In each case show that an arbitrary vector (a, b, 0) ∈ W is a linear combination of u and v.

(i) Set (a, b, 0) = xu + yv:

(a, b, 0) = x(1, 2, 0) + y(0, 1, 0) = (x, 2x + y, 0)

Then form the system of equations

x = a,    2x + y = b,    0 = 0

The system is consistent; in fact x = a, y = b - 2a is a solution. Hence u and v generate W.

(ii) Set (a, b, 0) = xu + yv:

(a, b, 0) = x(2, -1, 0) + y(1, 3, 0) = (2x + y, -x + 3y, 0)

Form the following system and reduce it to echelon form:

2x + y = a          2x + y = a
-x + 3y = b    or       7y = a + 2b
      0 = 0              0 = 0

The system is consistent and so has a solution. Hence W is generated by u and v. (Observe
that we do not need to solve for x and y; it is only necessary to know that a solution exists.)

4.26. Show that the vector space V of polynomials over any field K cannot be generated by 
a finite number of vectors. 

Any finite set S of polynomials contains one of maximum degree, say m. Then the linear span 
L(S) of S cannot contain polynomials of degree greater than m. Accordingly, V ≠ L(S), for any
finite set S. 



78 VECTOR SPACES AND SUBSPACES [CHAP. 4 

4.27. Prove Theorem 4.5: Let S be a nonempty subset of V. Then L(S), the set of all
linear combinations of vectors in S, is a subspace of V containing S. Furthermore, if
W is any other subspace of V containing S, then L(S) ⊂ W.

If v ∈ S, then 1v = v ∈ L(S); hence S is a subset of L(S). Also, L(S) is nonempty since S is
nonempty. Now suppose v, w ∈ L(S); say,

v = a1v1 + ... + amvm    and    w = b1w1 + ... + bnwn

where vi, wj ∈ S and ai, bj are scalars. Then

v + w = a1v1 + ... + amvm + b1w1 + ... + bnwn

and, for any scalar k,

kv = k(a1v1 + ... + amvm) = ka1v1 + ... + kamvm

belong to L(S) since each is a linear combination of vectors in S. Accordingly, L(S) is a subspace
of V.

Now suppose W is a subspace of V containing S, and suppose v1, ..., vm ∈ S ⊂ W. Then all

the multiples a1v1, ..., amvm ∈ W, where ai ∈ K, and hence the sum a1v1 + ... + amvm ∈ W. That

is, W contains all linear combinations of elements of S. Consequently, L(S) ⊂ W, as claimed.



ROW SPACE OF A MATRIX 

4.28. Determine whether the following matrices have the same row space: 

A = (1  1   5),      B = (1  -1  -2),      C = (1  -1  -1)
    (2  3  13)           (3  -2  -3)           (4  -3  -1)
                                               (3  -1   3)

Row reduce each matrix to row canonical form:

A   to   (1  1  5)   to   (1  0  2)
         (0  1  3)        (0  1  3)

B   to   (1  -1  -2)   to   (1  0  1)
         (0   1   3)        (0  1  3)

C   to   (1  -1  -1)   to   (1  -1  -1)   to   (1  0  2)
         (0   1   3)        (0   1   3)        (0  1  3)
         (0   2   6)        (0   0   0)        (0  0  0)

Since the nonzero rows of the reduced form of A and of the reduced form of C are the same, 
A and C have the same row space. On the other hand, the nonzero rows of the reduced form of B 
are not the same as the others, and so B has a different row space. 
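The row-canonical-form criterion is easy to automate: two matrices have the same row space exactly when their reduced row echelon forms have the same nonzero rows. A sketch assuming Python with sympy, using the three matrices shown in the reduction above:

```python
from sympy import Matrix

def row_space_signature(M):
    """Nonzero rows of the reduced row echelon form, as a tuple."""
    R, _ = Matrix(M).rref()
    return tuple(tuple(R.row(i)) for i in range(R.rows) if any(R.row(i)))

A = [[1, 1, 5], [2, 3, 13]]
B = [[1, -1, -2], [3, -2, -3]]
C = [[1, -1, -1], [4, -3, -1], [3, -1, 3]]

print(row_space_signature(A) == row_space_signature(C))   # True:  same row space
print(row_space_signature(A) == row_space_signature(B))   # False: different row space
```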



4.29. Consider an arbitrary matrix A = (aij). Suppose u = (b1, ..., bn) is a linear com-
bination of the rows R1, ..., Rm of A; say u = k1R1 + ... + kmRm. Show that, for each
i, bi = k1a1i + k2a2i + ... + kmami where a1i, ..., ami are the entries of the ith column
of A.

We are given u = k1R1 + ... + kmRm; hence

(b1, ..., bn) = k1(a11, ..., a1n) + ... + km(am1, ..., amn)

             = (k1a11 + ... + kmam1, ..., k1a1n + ... + kmamn)

Setting corresponding components equal to each other, we obtain the desired result.



CHAP. 4]                              VECTOR SPACES AND SUBSPACES                                        79



4.30. Prove: Let A = (aij) be an echelon matrix with distinguished entries a_{1j1}, a_{2j2}, ..., a_{rjr},
and let B = (bij) be an echelon matrix with distinguished entries b_{1k1}, b_{2k2}, ..., b_{sks}.
(The distinguished entry of a nonzero row is its first nonzero entry; the entries to its right
are arbitrary.)


Suppose A and B have the same row space. Then the distinguished entries of A and
of B are in the same position: j1 = k1, j2 = k2, ..., jr = kr, and r = s.

Clearly A = 0 if and only if B = 0, and so we need only prove the theorem when r ≥ 1
and s ≥ 1. We first show that j1 = k1. Suppose j1 < k1. Then the j1th column of B is zero.
Since the first row of A is in the row space of B, we have by the preceding problem a_{1j1} =
c1·0 + c2·0 + ... + cm·0 = 0 for scalars ci. But this contradicts the fact that the distinguished
entry a_{1j1} ≠ 0. Hence j1 ≥ k1, and similarly k1 ≥ j1. Thus j1 = k1.

Now let A' be the submatrix of A obtained by deleting the first row of A, and let B' be the
submatrix of B obtained by deleting the first row of B. We prove that A' and B' have the same
row space. The theorem will then follow by induction since A' and B' are also echelon matrices.

Let R = (a1, a2, ..., an) be any row of A' and let R1, ..., Rm be the rows of B. Since R is in
the row space of B, there exist scalars d1, ..., dm such that R = d1R1 + d2R2 + ... + dmRm. Since
A is in echelon form and R is not the first row of A, the k1th entry of R is zero: ai = 0 for
i = j1 = k1. Furthermore, since B is in echelon form, all the entries in the k1th column of B are
0 except the first: b_{1k1} ≠ 0, but b_{2k1} = 0, ..., b_{mk1} = 0. Thus

0 = a_{k1} = d1 b_{1k1} + d2·0 + ... + dm·0 = d1 b_{1k1}

Now b_{1k1} ≠ 0 and so d1 = 0. Thus R is a linear combination of R2, ..., Rm and so is in the row
space of B'. Since R was any row of A', the row space of A' is contained in the row space of B'.
Similarly, the row space of B' is contained in the row space of A'. Thus A' and B' have the same
row space, and so the theorem is proved.



4.31. Prove Theorem 4.7: Let A = (aij) and B = (bij) be row reduced echelon matrices.
Then A and B have the same row space if and only if they have the same nonzero rows.

Obviously, if A and B have the same nonzero rows then they have the same row space. Thus 
we only have to prove the converse. 

Suppose A and B have the same row space, and suppose R ≠ 0 is the ith row of A. Then there
exist scalars c1, ..., cs such that

R = c1R1 + c2R2 + ... + csRs     (1)

where the Ri are the nonzero rows of B. The theorem is proved if we show that R = Ri, that is,
ci = 1 but ck = 0 for k ≠ i.

Let a_{ij_i} be the distinguished entry in R, i.e. the first nonzero entry of R. By (1) and Problem 4.29,

a_{ij_i} = c1 b_{1j_i} + c2 b_{2j_i} + ... + cs b_{sj_i}     (2)

But by the preceding problem b_{ij_i} is a distinguished entry of B and, since B is row reduced, it is
the only nonzero entry in the j_i th column of B. Thus from (2) we obtain a_{ij_i} = ci b_{ij_i}. However,
a_{ij_i} = 1 and b_{ij_i} = 1 since A and B are row reduced; hence ci = 1.

Now suppose k ≠ i, and b_{kj_k} is the distinguished entry in Rk. By (1) and Problem 4.29,

a_{ij_k} = c1 b_{1j_k} + c2 b_{2j_k} + ... + cs b_{sj_k}     (3)

Since B is row reduced, b_{kj_k} is the only nonzero entry in the j_k th column of B; hence by (3),
a_{ij_k} = ck b_{kj_k}. Furthermore, by the preceding problem a_{kj_k} is a distinguished entry of A and,
since A is row reduced, a_{ij_k} = 0. Thus ck b_{kj_k} = 0 and, since b_{kj_k} = 1, ck = 0. Accordingly R = Ri
and the theorem is proved.
and the theorem is proved. 

4.32. Determine whether the following matrices have the same column space: 

A = (1  3  5),      B = ( 1   2   3)
    (1  4  3)           (-2  -3  -4)
    (1  1  9)           ( 7  12  17)

Observe that A and B have the same column space if and only if the transposes Aᵗ and Bᵗ have
the same row space. Thus reduce Aᵗ and Bᵗ to row reduced echelon form:

Aᵗ = (1  1  1)   to   (1  1   1)   to   (1  1   1)   to   (1  0   3)
     (3  4  1)        (0  1  -2)        (0  1  -2)        (0  1  -2)
     (5  3  9)        (0 -2   4)        (0  0   0)        (0  0   0)

Bᵗ = (1  -2   7)   to   (1  -2   7)   to   (1  -2   7)   to   (1  0   3)
     (2  -3  12)        (0   1  -2)        (0   1  -2)        (0  1  -2)
     (3  -4  17)        (0   2  -4)        (0   0   0)        (0  0   0)

Since Aᵗ and Bᵗ have the same row space, A and B have the same column space.

4.33. Let jR be a row vector and B a matrix for which RB is defined. Show that RB is a 
linear combination of the rows of B. Furthermore, if A is a matrix for which AB is 
defined, show that the row space of AB is contained in the row space of B. 

Suppose R = (a1, a2, ..., am) and B = (bij). Let B1, ..., Bm denote the rows of B and
B¹, ..., Bⁿ its columns. Then

RB = (R·B¹, R·B², ..., R·Bⁿ)

   = (a1b11 + a2b21 + ... + ambm1,  a1b12 + a2b22 + ... + ambm2,  ...,  a1b1n + a2b2n + ... + ambmn)

   = a1(b11, b12, ..., b1n) + a2(b21, b22, ..., b2n) + ... + am(bm1, bm2, ..., bmn)

   = a1B1 + a2B2 + ... + amBm

Thus RB is a linear combination of the rows of B, as claimed.

By Problem 3.27, the rows of AB are RiB where Ri is the ith row of A. Hence by the above
result each row of AB is in the row space of B. Thus the row space of AB is contained in the row 
space of B. 

SUMS AND DIRECT SUMS 

4.34. Let U and W be subspaces of a vector space V. Show that: 

(i) U and W are contained in JJ + W; 

(ii) U + W is the smallest subspace of V containing U and W, that is, U + W is the
linear span of U and W:  U + W = L(U, W).

(i) Let u ∈ U. By hypothesis W is a subspace of V and so 0 ∈ W. Hence u = u + 0 ∈ U + W.
Accordingly, U is contained in U + W. Similarly, W is contained in U + W.

(ii) Since U + W is a subspace of V (Theorem 4.8) containing both U and W, it must also contain
the linear span of U and W:  L(U, W) ⊂ U + W.

On the other hand, if v ∈ U + W then v = u + w = 1u + 1w where u ∈ U and w ∈ W;
hence v is a linear combination of elements in U ∪ W and so belongs to L(U, W). Thus
U + W ⊂ L(U, W).

The two inclusion relations give us the required result. 



CHAP. 4] VECTOR SPACES AND SUBSPACES 81 

4.35. Suppose U and W are subspaces of a vector space V, and that {ui} generates U and
{wj} generates W. Show that {ui, wj}, i.e. {ui} ∪ {wj}, generates U + W.

Let v ∈ U + W. Then v = u + w where u ∈ U and w ∈ W. Since {ui} generates U, u is a
linear combination of the ui; and since {wj} generates W, w is a linear combination of the wj:

u = a1u_{i1} + a2u_{i2} + ... + aru_{ir},   ai ∈ K

w = b1w_{j1} + b2w_{j2} + ... + bsw_{js},   bj ∈ K

Thus   v = u + w = a1u_{i1} + ... + aru_{ir} + b1w_{j1} + ... + bsw_{js}

and so {ui, wj} generates U + W.

4.36. Prove Theorem 4.9: The vector space V is the direct sum of its subspaces U and W
if and only if (i) V = U + W and (ii) U ∩ W = {0}.

Suppose V = U ⊕ W. Then any v ∈ V can be uniquely written in the form v = u + w
where u ∈ U and w ∈ W. Thus, in particular, V = U + W. Now suppose v ∈ U ∩ W. Then:

(1) v = v + 0 where v ∈ U, 0 ∈ W;   and   (2) v = 0 + v where 0 ∈ U, v ∈ W

Since such a sum for v must be unique, v = 0. Accordingly, U ∩ W = {0}.

On the other hand, suppose V = U + W and U ∩ W = {0}. Let v ∈ V. Since V = U + W,
there exist u ∈ U and w ∈ W such that v = u + w. We need to show that such a sum is unique.
Suppose also that v = u' + w' where u' ∈ U and w' ∈ W. Then

u + w = u' + w'   and so   u - u' = w' - w

But u - u' ∈ U and w' - w ∈ W; hence by U ∩ W = {0},

u - u' = 0,   w' - w = 0   and so   u = u',  w = w'

Thus such a sum for v ∈ V is unique and V = U ⊕ W.

4.37. Let U and W be the subspaces of R³ defined by

U = {(a, b, c) : a = b = c}   and   W = {(0, b, c)}

(Note that W is the yz plane.) Show that R³ = U ⊕ W.

Note first that U ∩ W = {0}, for v = (a, b, c) ∈ U ∩ W implies that

a = b = c  and  a = 0,   which implies   a = 0, b = 0, c = 0

i.e. v = (0, 0, 0).

We also claim that R³ = U + W. For if v = (a, b, c) ∈ R³, then v = (a, a, a) + (0, b - a, c - a)
where (a, a, a) ∈ U and (0, b - a, c - a) ∈ W. Both conditions, U ∩ W = {0} and R³ = U + W,
imply R³ = U ⊕ W.

4.38. Let V be the vector space of n-square matrices over the real field R. Let U and W be the
subspaces of symmetric and antisymmetric matrices, respectively. Show that
V = U ⊕ W. (The matrix M is symmetric iff M = Mᵗ, and antisymmetric iff
Mᵗ = -M.)

We first show that V = U + W. Let A be an arbitrary n-square matrix. Note that

A = ½(A + Aᵗ) + ½(A - Aᵗ)

We claim that ½(A + Aᵗ) ∈ U and that ½(A - Aᵗ) ∈ W. For

(½(A + Aᵗ))ᵗ = ½(A + Aᵗ)ᵗ = ½(Aᵗ + A) = ½(A + Aᵗ)

that is, ½(A + Aᵗ) is symmetric. Furthermore,

(½(A - Aᵗ))ᵗ = ½(A - Aᵗ)ᵗ = ½(Aᵗ - A) = -½(A - Aᵗ)

that is, ½(A - Aᵗ) is antisymmetric.

We next show that U ∩ W = {0}. Suppose M ∈ U ∩ W. Then M = Mᵗ and Mᵗ = -M, which
implies M = -M or M = 0. Hence U ∩ W = {0}. Accordingly, V = U ⊕ W.
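The decomposition A = ½(A + Aᵗ) + ½(A - Aᵗ) is easy to exercise numerically. A sketch assuming Python with numpy, applied to an arbitrary sample matrix:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])

S = (A + A.T) / 2      # symmetric part, lies in U
K = (A - A.T) / 2      # antisymmetric part, lies in W

print(np.allclose(S, S.T))       # True: S is symmetric
print(np.allclose(K, -K.T))      # True: K is antisymmetric
print(np.allclose(S + K, A))     # True: A = S + K
```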



82 VECTOR SPACES AND SUBSPACES [CHAP. 4 

Supplementary Problems 

VECTOR SPACES 

4.39. Let V be the set of infinite sequences (a1, a2, ...) in a field K with addition in V and scalar multi-
plication on V defined by

(a1, a2, ...) + (b1, b2, ...) = (a1 + b1, a2 + b2, ...)

k(a1, a2, ...) = (ka1, ka2, ...)

where ai, bi, k ∈ K. Show that V is a vector space over K.

4.40. Let V be the set of ordered pairs (a, 6) of real numbers with addition in V and scalar multiplication 

on V defined by 

(a, b) + (c, d) = (a + c, b + d)   and   k(a, b) = (ka, 0)

Show that V satisfies all of the axioms of a vector space except [M4]: 1u = u. Hence [M4] is not a
consequence of the other axioms.

4.41. Let V be the set of ordered pairs (a, b) of real numbers. Show that V is not a vector space over R 
with addition in V and scalar multiplication on V defined by: 

(i) (a, b) + (c, d) = (a + d, b + c) and k(a, b) = (ka, kb);

(ii) (a, b) + (c, d) = (a + c, b + d) and k(a, b) = (a, b);

(iii) (a, b) + (c, d) = (0, 0) and k(a, b) = (ka, kb);

(iv) (a, b) + (c, d) = (ac, bd) and k(a, b) = (ka, kb).

4.42. Let V be the set of ordered pairs (zi, z^) of complex numbers. Show that V is a vector space over the 
real field R with addition in V and scalar multiplication on V defined by 

(zj, Z2) + (wi, W2) = («i + Wi, «2 + '"'2) and k{zi, 22) = {kzi, kz2) 
where z^, Z2, Wi, W2 ^ C and k GB,. 

4.43. Let y be a vector space over K, and let F be a subfield of K. Show that V is also a vector space 
over F where vector addition with respect to F is the same as that with respect to K, and where 
scalar multiplication by an element k G F is the same as multiplication by k as an element of K. 

4.44. Show that [A4], page 63, can be derived from the other axioms of a vector space. 

4.45. Let U and W be vector spaces over a field K. Let V be the set of ordered pairs (u, w) where u 
belongs to U and w to W: V = {{u, w) : uGU, w G W}. Show that y is a vector space over K 
with addition in V and scalar multiplication on V defined by 

(u, w) + (u', w') = (u + u',w + w') and k(u, w) = (ku, kw) 

where u, u' G U, w,w' GW and k G K. (This space V is called the external direct sum of U 
and W.) 

SUBSPACES 

4.46. Consider the vector space V in Problem 4.39, of infinite sequences (a^, 02, . . .) in a field K. Show 
that TF is a subspace of V if: 

(i) W consists of all sequences with as the first component; 

(ii) W consists of all sequences with only a finite number of nonzero components. 

4.47. Determine whether or not W is a subspace of R³ if W consists of those vectors (a, b, c) ∈ R³ for
which: (i) a = 2b; (ii) a ≤ b ≤ c; (iii) ab = 0; (iv) a = b - c; (v) a = b²; (vi) k1a + k2b + k3c = 0,
where ki ∈ R.

4.48. Let y be the vector space of w-square matrices over a field K. Show that T^ is a subspace of V if 
W consists of all matrices which are (i) antisymmetric (A« = -A), (ii) (upper) triangular, 
(iii) diagonal, (iv) scalar. 



CHAP. 4] VECTOR SPACES AND SUBSPACES 83 

4.49. Let AX = B be a nonhomogeneous system of linear equations in n unknowns over a field K. 
Show that the solution set of the system is not a subspace of K". 

4.50. Let V be the vector space of all functions from the real field R into R. Show that W is a subspace 
of V in each of the following cases. 

(i) W consists of all bounded functions. (Here / : R -^ R is bounded if there exists Af G R such 
that |/(a;)| ^ M, Va; G R.) 

(ii) W consists of all even functions. (Here / : R -* R is even if /(— «) = f{x), Va; G R.) 

(iii) W consists of all continuous functions. 

(iv) W consists of all difTerentiable functions. 

(v) W consists of all integrable functions in, say, the interval — a; — 1. 

(The last three cases require some knowledge of analysis.) 

4.51. Discuss whether or not R^ is a subspace of R^. 

4.52. Prove Theorem 4.4: The intersection of any number of subspaces of a vector space V is a subspace 

of y. 

4.53. Suppose U and W are subspaces of V for which UuW is also a subspace. Show that either 
UcW or WcU. 

LINEAR COMBINATIONS 

4.54. Consider the vectors u = (1, -3, 2) and v = (2, -1, 1) in R3. 
(i) Write (1, 7, —4) as a linear combination of u and v. 

(ii) Write (2, —5, 4) as a linear combination of u and v. 

(iii) For which value of k is (1, k, 5) a linear combination of u and vt 

(iv) Find a condition on a, b and c so that (a, 6, e) is a linear combination of u and v. 

4.55. Write u as a linear combination of the polynomials v = 2t² + 3t - 4 and w = t² - 2t - 3 where
(i) u = 3t² + 8t - 5, (ii) u = 4t² - 6t - 1.

4.56. Write B as a linear combination of ^ -((._■,) > ^ = (_j n) *"d *^ ~ ( ) 
where: (i) E = Q "^ ; (ii) E = (J _l^ . 

LINEAR SPANS, GENERATORS 

4.57. Show that (1, 1, 1), (0, 1, 1) and (0, 1, -1) generate R³, i.e. that any vector (a, b, c) is a linear com-
bination of the given vectors.

4.58. Show that the yz plane W = {(0, b, c)} in R³ is generated by: (i) (0, 1, 1) and (0, 2, -1); (ii) (0, 1, 2),
(0, 2, 3) and (0, 3, 1).

4.59. Show that the complex numbers w = 2 + 3i and z = 1 - 2i generate the complex field C as a
vector space over the real field R.

4.60. Show that the polynomials (1 - t)³, (1 - t)², 1 - t and 1 generate the space of polynomials of
degree ≤ 3.

4.61. Find one vector in R³ which generates the intersection of U and W where U is the xy plane:
U = {(a, b, 0)}, and W is the space generated by the vectors (1, 2, 3) and (1, -1, 1).

4.62. Prove: L(S) is the intersection of all the subspaces of V containing S. 



84 VECTOR SPACES AND SUBSPACES [CHAP. 4 

4.63. Show that L{S) = L(S u{0}). That is, by joining or deleting the zero vector from a set, we do not 
change the space generated by the set. 

4.64. Show that if S c T, then L(S) c L(T). 

4.65. Show that LmS)) = L(S). 

ROW SPACE OF A MATRIX 

4.66. Determine which of the following matrices have the same row space: 

/l -1 3\ 



.-<: -. :,, -(.ra-O' '^ = 1:-'° 



(3 -4 5)' 



4.67. Let Ml = (1, 1, -1), U2 = (2, 3, -1), M3 = (3, 1, -5) 

vi = (1,-1,-3), V2 = (3,-2,-8), Vs = (2,1,-3) 
Show that the subspace of R* generated by the it; is the same as the subspace generated by the Vj. 

4.68. Show that if any row of an echelon (row reduced echelon) matrix is deleted, then the resulting 
matrix is still in echelon (row reduced echelon) form. 

4.69. Prove the converse of Theorem 4.6: Matrices with the same row space (and the same size) are 
row equivalent. 

4.70. Show that A and B have the same column space iff A« and B* have the same row space. 

4.71. Let A and B be matrices for which AB is defined. Show that the column space of AB is contained 
in the column space of A. 

SUMS AND DIRECT SUMS 

4.72. We extend the notion of sum to arbitrary nonempty subsets (not necessarily subspaces) S and T of 
a vector space V by defining S + T = {s + t : sG S, tG T}. Show that this operation satisfies: 

(i) commutative law: S+ T = T + S; 

(ii) associative law: (Si + S2) + S3 = Si + (S2 + S3) ; 

(iii) S + {0} = {0} + S = S; 

(iv) S + V = V + S = V. 

4.73. Show that for any subspace W of a vector space V, W + W = W. 

4.74. Give an example of a subset S of a vector space V which is not a subspace of V but for which 

(i) S + S = S, (ii) S + S C S (properly contained). 

4.75. We extend the notion of sum of subspaces to more than two summands as follows. If T^i, W^, . . . , T^„ 
are subspaces of V, then 

Wi + W2+---+W„ = {wi + Wi+'-'+w^: WiGWi) 
Show that: 

(i) L{Wi, W2 W„) ^ W, + W2+ ■■■ + W„; 

(ii) if Si generates Wi, i = 1, . . .,n, then Si U S2 U • • • U S„ generates W^ + W2 + • • ■ + Wn- 

4.76. Suppose U, V aijd W are subspaces of a vector space. Prove that 

(UnV) + (UnW) c Un(V+W) 

Find subspaces of B? for which equality does not hold. 



CHAP. 4] VECTOR SPACES AND SUBSPACES 85 

4.77. Let U, V and W be the following subspaces of R³:

U = {(a, b, c) : a + b + c = 0},   V = {(a, b, c) : a = c},   W = {(0, 0, c) : c ∈ R}

Show that (i) R³ = U + V, (ii) R³ = U + W, (iii) R³ = V + W. When is the sum direct?

4.78. Let V be the vector space of all functions from the real field B into B. Let U be the subspace of 
even functions and W the subspace of odd functions. Show that V = U ®W. (Recall that / is 
even iff f{-x) - f{x), and / is odd iff f(-x) = -f(x).) 

4.79. Let Wi, W^, ... be subspaces of a vector space V for which Wy<zW.i,<Z- • ■ . Let PF = Wj U TFj U • • ■ . 
Show that W is a subspace of Y. 

4.80. In the preceding problem, suppose Sj generates W^, i = 1, 2, Show that S = Sj U Sa U ■ • • 

generates W. 

4.81. Let V be the vector space of w-square matrices over a field K. Let U be the subspace of upper 
triangular matrices and W the subspace of lower triangular matrices. Find (i) U + W, (ii) UnW. 

4.82. Let V be the external direct sum of the vector spaces U and W over a field K. (See Problem 4.45.) 
Let A ys. 

U = {(m,0): uGU}, W = {(0,w): w & W} 

Show that (i) U and W are subspaces of V, (ii) V = U ® W. 



Answers to Supplementary Problems 

4.47. (i) Yes. (iv) Yes. 

(ii) No; e.g. (1, 2, 3) ∈ W but -2(1, 2, 3) ∉ W.     (v) No; e.g. (9, 3, 0) ∈ W but 2(9, 3, 0) ∉ W.

(iii) No; e.g. (1, 0, 0), (0, 1, 0) ∈ W,             (vi) Yes.
      but not their sum.

4.50. (1) Let f,g GW with M^ and Mg bounds for / and g respectively. Then for any scalars a, 6 G R, 

\(af+bg)(x)\ = \af(x) + bg(x)\ ^ \af(x)\ + \bg(x)\ = |a| |/(*)| + |6| |ff(a;)| ^ \a\Mf+\b\Mg 
That is, \a\Mf + \b\Mg is a bound for the function af + bg. 

(ii) (af + bg)(-x) = af(-x) + bg(-x) = af(x) + bg(x) = (af + bg)(x} 

4.51. No. Although one may "identify" the vector (a, b) G R2 with, say, (a, b, 0) in the xy plane in R3, 
they are distinct elements belonging to distinct, disjoint sets. 

4.54. (i) -3u + 2v. (ii) Impossible. (iii) k = -8. (iv) a - 3b - 5c = 0.

4.55. (i) u — 2v — w. (ii) Impossible. 

4.56. (i) E = 2A- B + 2C. (ii) Impossible. 
4.61. (2, -5, 0). 

4.66. A and C. 

4.67. Form the matrix A whose rows are the Mj and the matrix B whose rows are the Uj, and then show 
that A and B have the same row canonical forms. 

4.74. (i) InR2,let S = {(0,0), (0,1), (0,2), (0,3), . . .}. 
(ii) InR2, let S = {(0,5), (0,6), (0,7), ...}. 

4.77. The sum is direct in (ii) and (iii). 

4.78. Hint. f(x) = ½(f(x) + f(-x)) + ½(f(x) - f(-x)), where ½(f(x) + f(-x)) is even and ½(f(x) - f(-x))
is odd.

4.81. (i) V = U + W. (ii) J7nW is the space of diagonal matrices. 



chapter 5 



Basis and Dimension 

INTRODUCTION 

Some of the fundamental results proven in this chapter are: 

(i) The "dimension" of a vector space is well defined (Theorem 5.3). 

(ii) If V has dimension n over K, then V is "isomorphic" to K" (Theorem 5.12). 

(iii) A system of linear equations has a solution if and only if the coefficient and 
augmented matrices have the same "rank" (Theorem 5.10). 

These concepts and results are nontrivial and answer certain questions raised and investi- 
gated by mathematicians of yesterday. 

We will begin the chapter with the definition of linear dependence and independence. 
This concept plays an essential role in the theory of linear algebra and in mathematics in 
general. 

LINEAR DEPENDENCE 

Definition:  Let V be a vector space over a field K. The vectors v1, ..., vm ∈ V are said

to be linearly dependent over K, or simply dependent, if there exist scalars

a1, ..., am ∈ K, not all of them 0, such that

a1v1 + a2v2 + ... + amvm = 0     (*)

Otherwise, the vectors are said to be linearly independent over K, or simply
independent.

Observe that the relation (*) will always hold if the a's are all 0. If this relation holds
only in this case, that is,

a1v1 + a2v2 + ... + amvm = 0   only if   a1 = 0, ..., am = 0

then the vectors are linearly independent. On the other hand, if the relation (*) also holds
when one of the a's is not 0, then the vectors are linearly dependent.

Observe that if 0 is one of the vectors v1, ..., vm, say v1 = 0, then the vectors must be
dependent; for

1·v1 + 0·v2 + ... + 0·vm = 1·0 + 0 + ... + 0 = 0

and the coefficient of v1 is not 0. On the other hand, any nonzero vector v is, by itself,
independent; for

kv = 0,  v ≠ 0   implies   k = 0

Other examples of dependent and independent vectors follow. 

Example 5.1:  The vectors u = (1, -1, 0), v = (1, 3, -1) and w = (5, 3, -2) are dependent since
3u + 2v - w = 0:

3(1, -1, 0) + 2(1, 3, -1) - (5, 3, -2) = (0, 0, 0)

86 



CHAP. 5] BASIS AND DIMENSION 87 

Example 5.2:  We show that the vectors u = (6, 2, 3, 4), v = (0, 5, -3, 1) and w = (0, 0, 7, -2)
are independent. For suppose xu + yv + zw = 0 where x, y and z are unknown
scalars. Then

(0, 0, 0, 0) = x(6, 2, 3, 4) + y(0, 5, -3, 1) + z(0, 0, 7, -2)

             = (6x, 2x + 5y, 3x - 3y + 7z, 4x + y - 2z)

and so, by the equality of the corresponding components,

6x                = 0
2x + 5y           = 0
3x - 3y + 7z      = 0
4x +  y      - 2z = 0

The first equation yields x = 0; the second equation with x = 0 yields y = 0; and
the third equation with x = 0, y = 0 yields z = 0. Thus

xu + yv + zw = 0   implies   x = 0, y = 0, z = 0
Accordingly u, v and w are independent. 
Observe that the vectors in the preceding example form a matrix in echelon form:

(6  2   3   4)
(0  5  -3   1)
(0  0   7  -2)
Thus we have shown that the (nonzero) rows of the above echelon matrix are independent. 
This result holds true in general; we state it formally as a theorem since it will be frequently 
used. 

Theorem 5.1: The nonzero rows of a matrix in echelon form are linearly independent. 

For more than one vector, the concept of dependence can be defined equivalently as 
follows: 

The vectors Vi, . . .,Vm are linearly dependent if and only if one of them is a linear 
combination of the others. 

For suppose, say, vi is a linear combination of the others:

vi = a1v1 + ... + a_{i-1}v_{i-1} + a_{i+1}v_{i+1} + ... + amvm

Then by adding -vi to both sides, we obtain

a1v1 + ... + a_{i-1}v_{i-1} - vi + a_{i+1}v_{i+1} + ... + amvm = 0

where the coefficient of vi is not 0; hence the vectors are linearly dependent. Conversely,
suppose the vectors are linearly dependent, say,

b1v1 + ... + bjvj + ... + bmvm = 0    where bj ≠ 0

Then   vj = -bj⁻¹b1v1 - ... - bj⁻¹b_{j-1}v_{j-1} - bj⁻¹b_{j+1}v_{j+1} - ... - bj⁻¹bmvm

and so Vj is a linear combination of the other vectors. 

We now make a slightly stronger statement than that above; this result has many im- 
portant consequences. 

Lemma 5.2:  The nonzero vectors v1, ..., vm are linearly dependent if and only if one of
them, say vi, is a linear combination of the preceding vectors:

vi = k1v1 + k2v2 + ... + k_{i-1}v_{i-1}



88                                              BASIS AND DIMENSION                                    [CHAP. 5



Remark 1.  The set {v1, ..., vm} is called a dependent or independent set according as the
vectors v1, ..., vm are dependent or independent. We also define the empty
set ∅ to be independent.

Remark 2.  If two of the vectors v1, ..., vm are equal, say v1 = v2, then the vectors are
dependent. For

v1 - v2 + 0v3 + ... + 0vm = 0

and the coefficient of v1 is not 0.

Remark 3.  Two vectors v1 and v2 are dependent if and only if one of them is a multiple of
the other.

Remark 4.  A set which contains a dependent subset is itself dependent. Hence any
subset of an independent set is independent.

Remark 5.  If the set {v1, ..., vm} is independent, then any rearrangement of the vectors
{v_{i1}, v_{i2}, ..., v_{im}} is also independent.

Remark 6. In the real space R^ dependence of vectors can be described geometrically as 
follows: any two vectors u and v are dependent if and only if they lie on the 
same line through the origin; and any three vectors u, v and w are dependent 
if and only if they lie on the same plane through the origin: 





(u and v are dependent)                              (u, v and w are dependent)



BASIS AND DIMENSION 

We begin with a definition. 
Definition:  A vector space V is said to be of finite dimension n or to be n-dimensional,
written dim V = n, if there exist linearly independent vectors e1, e2, ..., en
which span V. The sequence {e1, e2, ..., en} is then called a basis of V.

The above definition of dimension is well defined in view of the following theorem.

Theorem 5.3:  Let V be a finite dimensional vector space. Then every basis of V has the
same number of elements.

The vector space {0} is defined to have dimension 0. (In a certain sense this agrees with
the above definition since, by definition, ∅ is independent and generates {0}.) When a
vector space is not of finite dimension, it is said to be of infinite dimension.

Example 5.3:  Let K be any field. Consider the vector space Kⁿ which consists of n-tuples of ele-
ments of K. The vectors

e1 = (1, 0, 0, ..., 0, 0)
e2 = (0, 1, 0, ..., 0, 0)
.........................
en = (0, 0, 0, ..., 0, 1)

form a basis, called the usual basis, of Kⁿ. Thus Kⁿ has dimension n.



CHAP. 5]                                        BASIS AND DIMENSION                                              89



Example 5.4:  Let U be the vector space of all 2 × 3 matrices over a field K. Then the matrices

(1  0  0)   (0  1  0)   (0  0  1)   (0  0  0)   (0  0  0)   (0  0  0)
(0  0  0)   (0  0  0)   (0  0  0)   (1  0  0)   (0  1  0)   (0  0  1)

form a basis of U. Thus dim U = 6. More generally, let V be the vector space
of all m × n matrices over K and let Eij ∈ V be the matrix with ij-entry 1 and
0 elsewhere. Then the set {Eij} is a basis, called the usual basis, of V (Problem 5.32);
consequently dim V = mn.

Example 5.5:  Let W be the vector space of polynomials (in t) of degree ≤ n. The set {1, t, t², ..., tⁿ}
is linearly independent and generates W. Thus it is a basis of W and so
dim W = n + 1.

We comment that the vector space V of all polynomials is not finite dimensional 
since (Problem 4.26) no finite set of polynomials generates V. 

The above fundamental theorem on dimension is a consequence of the following im- 
portant "replacement lemma": 

Lemma 5.4:  Suppose the set {v1, v2, ..., vn} generates a vector space V. If {w1, ..., wm}
is linearly independent, then m ≤ n and V is generated by a set of the form

{w1, ..., wm, v_{i1}, ..., v_{i_{n-m}}}

Thus, in particular, any n + 1 or more vectors in V are linearly dependent.

Observe in the above lemma that we have replaced m of the vectors in the generating 
set by the m independent vectors and still retained a generating set. 

Now suppose S is a subset of a vector space V. We call {vi, . . ., Vm} a maximal in- 
dependent subset of S if: 

(i) it is an independent subset of S; and 

(ii) {vi, . . .,Vm,w} is dependent for any w e S. 

The following theorem applies. 

Theorem 5.5: Suppose S generates V and {vi, . . ., Vm} is a maximal independent subset 
of S. Then (vi, . . . , Vm} is a basis of V. 

The main relationship between the dimension of a vector space and its independent 
subsets is contained in the next theorem. 

Theorem 5.6: Let V be of finite dimension n. Then: 

(i) Any set of n + 1 or more vectors is linearly dependent. 

(ii) Any linearly independent set is part of a basis, i.e. can be extended to 

a basis, 
(iii) A linearly independent set with n elements is a basis. 

Example 5.6:  The four vectors in K⁴

(1, 1, 1, 1),  (0, 1, 1, 1),  (0, 0, 1, 1),  (0, 0, 0, 1)

are linearly independent since they form a matrix in echelon form. Furthermore,
since dim K⁴ = 4, they form a basis of K⁴.

Example 5.7: The four vectors in R3, 

(257,-132,58), (43,0,-17), (521,-317,94), (328,-512,-731) 
must be linearly dependent since they come from a vector space of dimension 3. 



90                                              BASIS AND DIMENSION                                    [CHAP. 5



DIMENSION AND SUBSPACES 

The following theorems give basic relationships between the dimension of a vector space 
and the dimension of a subspace. 

Theorem 5.7:  Let W be a subspace of an n-dimensional vector space V. Then dim W ≤ n.
In particular, if dim W = n, then W = V.

Example 5.8:  Let W be a subspace of the real space R³. Now dim R³ = 3; hence by the preced-
ing theorem the dimension of W can only be 0, 1, 2 or 3. The following cases apply:

(i) dim W = 0, then W = {0}, a point;

(ii) dim W = 1, then W is a line through the origin;

(iii) dim W = 2, then W is a plane through the origin;

(iv) dim W = 3, then W is the entire space R³.

Theorem 5.8:  Let U and W be finite-dimensional subspaces of a vector space V. Then
U + W has finite dimension and

dim(U + W) = dim U + dim W - dim(U ∩ W)

Note that if V is the direct sum of U and W, i.e. V = U ⊕ W, then dim V =
dim U + dim W (Problem 5.48).

Example 5.9:  Suppose U and W are the xy plane and yz plane, respectively, in R³: U = {(a, b, 0)},
W = {(0, b, c)}. Since R³ = U + W, dim (U + W) = 3. Also, dim U = 2 and
dim W = 2. By the above theorem,

3 = 2 + 2 - dim(U ∩ W)   or   dim(U ∩ W) = 1

Observe that this agrees with the fact that U ∩ W is the y axis, i.e. U ∩ W =
{(0, b, 0)}, and so has dimension 1.












RANK OF A MATRIX 

Let A be an arbitrary m × n matrix over a field K. Recall that the row space of A is
the subspace of Kⁿ generated by its rows, and the column space of A is the subspace of Kᵐ
generated by its columns. The dimensions of the row space and of the column space of A
are called, respectively, the row rank and the column rank of A.

Theorem 5.9: The row rank and the column rank of the matrix A are equal. 

Definition: The rank of the matrix A, written rank (A), is the common value of its row 
rank and column rank. 

Thus the rank of a matrix gives the maximum number of independent rows, and also 
the maximum number of independent columns. We can obtain the rank of a matrix as 

follows. 

Suppose   A  =  (1   2   0  -1)
                (2   6  -3  -3)
                (3  10  -6  -5)

We reduce A to echelon form using the elementary row operations:

A   to   (1  2   0  -1)   to   (1  2   0  -1)
         (0  2  -3  -1)        (0  2  -3  -1)
         (0  4  -6  -2)        (0  0   0   0)

Recall that row equivalent matrices have the same row space. Thus the nonzero rows of the
echelon matrix, which are independent by Theorem 5.1, form a basis of the row space of
A. Hence the rank of A is 2.
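For a numerical matrix the same answer comes straight from a rank routine. A sketch assuming Python with numpy, applied to the matrix above:

```python
import numpy as np

A = np.array([[1,  2,  0, -1],
              [2,  6, -3, -3],
              [3, 10, -6, -5]])

print(np.linalg.matrix_rank(A))   # expected: 2
```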

APPLICATIONS TO LINEAR EQUATIONS 

Consider a system of m linear equations in n unknowns x1, ..., xn over a field K:

a11x1 + a12x2 + ... + a1nxn = b1
a21x1 + a22x2 + ... + a2nxn = b2
...................................
am1x1 + am2x2 + ... + amnxn = bm

or the equivalent matrix equation

AX = B

where A = (aij) is the coefficient matrix, and X = (xi) and B = (bi) are the column vectors
consisting of the unknowns and of the constants, respectively. Recall that the augmented
matrix of the system is defined to be the matrix

(A, B)  =  (a11  a12  ...  a1n  b1)
           (a21  a22  ...  a2n  b2)
           (...........................)
           (am1  am2  ...  amn  bm)

Remark 1. The above linear equations are said to be dependent or independent according 
as the corresponding vectors, i.e. the rows of the augmented matrix, are 
dependent or independent. 

Remark 2. Two systems of linear equations are equivalent if and only if the corresponding 
augmented matrices are row equivalent, i.e. have the same row space. 

Remark 3. We can always replace a system of equations by a system of independent 
equations, such as a system in echelon form. The number of independent 
equations will always be equal to the rank of the augmented matrix. 

Observe that the above system is also equivalent to the vector equation

x1(a11, a21, ..., am1) + x2(a12, a22, ..., am2) + ... + xn(a1n, a2n, ..., amn) = (b1, b2, ..., bm)
Thus the system AX = B has a solution if and only if the column vector B is a linear 
combination of the columns of the matrix A, i.e. belongs to the column space of A. This 
gives us the following basic existence theorem. 

Theorem 5.10: The system of linear equations AX — B has a solution if and only if the 
coefficient matrix A and the augmented matrix (A, B) have the same rank. 
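Theorem 5.10 translates directly into a computation: compare the rank of A with the rank of the augmented matrix. A sketch assuming Python with numpy, tried on the inconsistent system of Problem 4.18 (whose columns are e1, e2, e3 and whose right side is v = (2, -5, 3)):

```python
import numpy as np

def has_solution(A, B):
    """AX = B is solvable iff rank(A) equals the rank of the augmented matrix (A, B)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float).reshape(-1, 1)
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(np.hstack([A, B]))

A = [[1, 2, 1], [-3, -4, -5], [2, -1, 7]]
print(has_solution(A, [2, -5, 3]))   # False: rank(A) = 2 but rank(A, B) = 3
```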



92                                              BASIS AND DIMENSION                                    [CHAP. 5



Recall (Theorem 2.1) that if the system AX = B does have a solution, say v, then its
general solution is of the form v + W = {v + w : w ∈ W} where W is the general solution
of the associated homogeneous system AX = 0. Now W is a subspace of Kⁿ and so has a
dimension. The next theorem, whose proof is postponed until the next chapter (page 127),
applies.

Theorem 5.11:  The dimension of the solution space W of the homogeneous system of linear
equations AX = 0 is n - r where n is the number of unknowns and r is
the rank of the coefficient matrix A.

In case the system AX = 0 is in echelon form, then it has precisely n - r free variables
(see page 21), say x_{i1}, x_{i2}, ..., x_{i_{n-r}}. Let vj be the solution obtained by setting x_{ij} = 1
and all other free variables = 0. Then the solutions v1, ..., v_{n-r} are linearly independent
(Problem 5.43) and so form a basis for the solution space.

Example 5.10:  Find the dimension and a basis of the solution space W of the system of linear
equations

 x + 2y - 4z + 3r -  s = 0
 x + 2y - 2z + 2r +  s = 0
2x + 4y - 2z + 3r + 4s = 0

Reduce the system to echelon form:

x + 2y - 4z + 3r - s = 0                     x + 2y - 4z + 3r - s = 0
          2z -  r + 2s = 0     and then                2z - r + 2s = 0
          6z - 3r + 6s = 0

There are 5 unknowns and 2 (nonzero) equations in echelon form; hence dim W =
5 - 2 = 3. Note that the free variables are y, r and s. Set:

(i) y = 1, r = 0, s = 0,   (ii) y = 0, r = 1, s = 0,   (iii) y = 0, r = 0, s = 1

to obtain the following respective solutions:

v1 = (-2, 1, 0, 0, 0),   v2 = (-1, 0, ½, 1, 0),   v3 = (-3, 0, -1, 0, 1)

The set {v1, v2, v3} is a basis of the solution space W.
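The same basis can be produced mechanically. The sketch below (assuming Python with sympy) computes a basis of the null space of the coefficient matrix, with the unknowns ordered (x, y, z, r, s); it sets each free variable to 1 in turn, so it should reproduce the three vectors found above.

```python
from sympy import Matrix

# coefficient matrix of the homogeneous system, unknowns ordered (x, y, z, r, s)
A = Matrix([[1, 2, -4, 3, -1],
            [1, 2, -2, 2,  1],
            [2, 4, -2, 3,  4]])

basis = A.nullspace()
print(len(basis))        # expected: 3 = 5 unknowns - rank 2
for v in basis:
    print(v.T)           # each printed row is a basis vector of the solution space
```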



COORDINATES 

Let {e1, ..., en} be a basis of an n-dimensional vector space V over a field K, and let v
be any vector in V. Since {ei} generates V, v is a linear combination of the ei:

v = a1e1 + a2e2 + ... + anen,    ai ∈ K

Since the ei are independent, such a representation is unique (Problem 5.7), i.e. the n
scalars a1, ..., an are completely determined by the vector v and the basis {ei}. We call
these scalars the coordinates of v in {ei}, and we call the n-tuple (a1, ..., an) the coordinate
vector of v relative to {ei} and denote it by [v]e or simply [v]:

[v]e = (a1, a2, ..., an)

Example 5.11:  Let V be the vector space of polynomials with degree ≤ 2:

V = {at² + bt + c : a, b, c ∈ R}

The polynomials

e1 = 1,   e2 = t - 1   and   e3 = (t - 1)² = t² - 2t + 1

form a basis for V. Let v = 2t² - 5t + 6. Find [v]e, the coordinate vector of v
relative to the basis {e1, e2, e3}.

Set v as a linear combination of the ei using the unknowns x, y and z:  v = xe1 +
ye2 + ze3.

2t² - 5t + 6 = x(1) + y(t - 1) + z(t² - 2t + 1)

             = x + yt - y + zt² - 2zt + z

             = zt² + (y - 2z)t + (x - y + z)

Then set the coefficients of the same powers of t equal to each other:

x - y +  z = 6
    y - 2z = -5
         z = 2

The solution of the above system is x = 3, y = -1, z = 2. Thus

v = 3e1 - e2 + 2e3,   and so   [v]e = (3, -1, 2)

Example 5.12:  Consider the real space R³. Find the coordinate vector of v = (3, 1, -4) relative to
the basis f1 = (1, 1, 1), f2 = (0, 1, 1), f3 = (0, 0, 1).

Set v as a linear combination of the fi using the unknowns x, y and z:  v = xf1 +
yf2 + zf3.

(3, 1, -4) = x(1, 1, 1) + y(0, 1, 1) + z(0, 0, 1)

           = (x, x, x) + (0, y, y) + (0, 0, z)

           = (x, x + y, x + y + z)

Then set the corresponding components equal to each other to obtain the equivalent
system of equations

x              =  3
x + y          =  1
x + y + z      = -4

having solution x = 3, y = -2, z = -5. Thus [v]f = (3, -2, -5).

We remark that relative to the usual basis e1 = (1, 0, 0), e2 = (0, 1, 0), e3 =
(0, 0, 1), the coordinate vector of v is identical to v itself: [v]e = (3, 1, -4) = v.
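Finding coordinates relative to a basis amounts to solving a linear system whose columns are the basis vectors. A sketch assuming Python with numpy:

```python
import numpy as np

# basis vectors f1, f2, f3 placed as the columns of F
F = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 1, 1]], dtype=float)
v = np.array([3, 1, -4], dtype=float)

coords = np.linalg.solve(F, v)
print(coords)   # expected: [ 3. -2. -5.], the coordinate vector [v]_f
```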

We have shown above that to each vector v ∈ V there corresponds, relative to a given
basis {e1, ..., en}, an n-tuple [v]e in Kⁿ. On the other hand, if (a1, ..., an) ∈ Kⁿ, then there
exists a vector in V of the form a1e1 + ... + anen. Thus the basis {ei} determines a one-to-
one correspondence between the vectors in V and the n-tuples in Kⁿ. Observe also that if

v = a1e1 + ... + anen   corresponds to   (a1, ..., an)
and   w = b1e1 + ... + bnen   corresponds to   (b1, ..., bn)

then

v + w = (a1 + b1)e1 + ... + (an + bn)en   corresponds to   (a1, ..., an) + (b1, ..., bn)

and, for any scalar k ∈ K,

kv = (ka1)e1 + ... + (kan)en   corresponds to   k(a1, ..., an)

That is,   [v + w]e = [v]e + [w]e   and   [kv]e = k[v]e

Thus the above one-to-one correspondence between V and Kⁿ preserves the vector space
operations of vector addition and scalar multiplication; we then say that V and Kⁿ are
isomorphic, written V ≅ Kⁿ. We state this result formally.

Theorem 5.12:  Let V be an n-dimensional vector space over a field K. Then V and Kⁿ are
isomorphic.



94 BASIS AND DIMENSION [CHAP. 5 

The next example gives a practical application of the above result. 

Example 5.13: Determine whether the following matrices are dependent or independent: 



A = (1  2  -3),     B = (1  3  -4),     C = ( 3   8  -11)
    (4  0   1)          (6  5   4)          (16  10    9)

The coordinate vectors of the above matrices relative to the basis in Example
5.4, page 89, are

[A] = (1, 2, -3, 4, 0, 1),   [B] = (1, 3, -4, 6, 5, 4),   [C] = (3, 8, -11, 16, 10, 9)

Form the matrix M whose rows are the above coordinate vectors:

M = (1  2   -3   4   0  1)
    (1  3   -4   6   5  4)
    (3  8  -11  16  10  9)

Row reduce M to echelon form:

M   to   (1  2  -3  4   0  1)   to   (1  2  -3  4  0  1)
         (0  1  -1  2   5  3)        (0  1  -1  2  5  3)
         (0  2  -2  4  10  6)        (0  0   0  0  0  0)

Since the echelon matrix has only two nonzero rows, the coordinate vectors [A], [B] 
and [C] generate a space of dimension 2 and so are dependent. Accordingly, the 
original matrices A, B and C are dependent. 
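The same test can be run by flattening each matrix into its coordinate vector and checking the rank of the stacked vectors. A sketch assuming Python with numpy:

```python
import numpy as np

A = np.array([[1, 2, -3], [4, 0, 1]])
B = np.array([[1, 3, -4], [6, 5, 4]])
C = np.array([[3, 8, -11], [16, 10, 9]])

M = np.vstack([A.flatten(), B.flatten(), C.flatten()])
rank = np.linalg.matrix_rank(M)
print(rank)                # expected: 2
print(rank < M.shape[0])   # True: rank < 3, so A, B and C are dependent
```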



Solved Problems 

LINEAR DEPENDENCE 

5.1.  Determine whether or not u and v are linearly dependent if:

(i) u = (3, 4), v = (1, -3)               (iii) u = (4, 3, -2), v = (2, -6, 7)

(ii) u = (2, -3), v = (6, -9)             (iv) u = (-4, 6, -2), v = (2, -3, 1)

(v) u = (1  -2   4),  v = (2  -4   8)     (vi) u = (1   2  -3),  v = (6  -5   4)
        (3   0  -1)       (6   0  -2)             (6  -5   4)       (1   2  -3)

(vii) u = 2 - 5t + 6t² - t³,   v = 3 + 2t - 4t² + 5t³

(viii) u = 1 - 3t + 2t² - 3t³,   v = -3 + 9t - 6t² + 9t³

Two vectors u and v are dependent if and only if one is a multiple of the other.

(i) No. (ii) Yes; for v = 3u. (iii) No. (iv) Yes; for u = -2v. (v) Yes; for v = 2u. (vi) No.
(vii) No. (viii) Yes; for v = -3u.

5.2. Determine whether or not the following vectors in R^ are linearly dependent: 

(i) (1,-2,1), (2, 1,-1), (7, -4,1) (iii) (1,2,-3), (1,-3,2), (2,-1,5) 

(ii) (1, -3, 7), (2, 0, -6), (3, -1, -1), (2, 4, -5) (iv) (2, -3, 7), (0, 0, 0), (3, -1, -4) 

(i) Method 1. Set a linear combination of the vectors equal to the zero vector using unknown 

scalars x, y and z: 

x(1, -2, 1) + y(2, 1, -1) + z(7, -4, 1) = (0, 0, 0)

Then   (x, -2x, x) + (2y, y, -y) + (7z, -4z, z) = (0, 0, 0)

or     (x + 2y + 7z, -2x + y - 4z, x - y + z) = (0, 0, 0)

Set corresponding components equal to each other to obtain the equivalent homogeneous system,
and reduce to echelon form:

  x + 2y + 7z = 0          x + 2y + 7z = 0          x + 2y + 7z = 0
-2x +  y - 4z = 0    or        5y + 10z = 0   or         y +  2z = 0
  x -  y +  z = 0             -3y -  6z = 0

The system, in echelon form, has only two nonzero equations in the three unknowns; hence the 
system has a nonzero solution. Thus the original vectors are linearly dependent. 

Method 2. Form the matrix whose rows are the given vectors, and reduce to echelon form using 
the elementary row operations: 



(1  -2   1)   to   (1  -2   1)   to   (1  -2   1)
(2   1  -1)        (0   5  -3)        (0   5  -3)
(7  -4   1)        (0  10  -6)        (0   0   0)



Since the echelon matrix has a zero row, the vectors are dependent. (The three given vectors 
generate a space of dimension 2.) 

(ii) Yes, since any four (or more) vectors in R3 are dependent. 

(iii) Form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form: 



(1   2  -3)   to   (1   2  -3)   to   (1   2  -3)
(1  -3   2)        (0  -5   5)        (0  -5   5)
(2  -1   5)        (0  -5  11)        (0   0   6)



Since the echelon matrix has no zero rows, the vectors are independent. (The three given vectors 
generate a space of dimension 3.) 








(iv) Since 0 = (0, 0, 0) is one of the vectors, the vectors are dependent.



5.3.  Let V be the vector space of 2 × 2 matrices over R. Determine whether the matrices
A, B, C ∈ V are dependent where:

(i)  A = (1  1),  B = (1  0),  C = (1  1);        (ii)  A = (1  2),  B = (3  -1),  C = ( 1  -5)
         (1  1)       (0  1)       (0  0)                   (3  1)       (2   2)       (-4   0)

(i) Set a linear combination of the matrices A, B and C equal to the zero matrix using unknown
scalars x, y and z; that is, set xA + yB + zC = 0. Thus:

x(1  1) + y(1  0) + z(1  1)  =  (0  0)
 (1  1)    (0  1)    (0  0)     (0  0)

or       (x + y + z    x + z)  =  (0  0)
         (x            x + y)     (0  0)



96 BASIS AND DIMENSION [CHAP. 5 

Set corresponding entries equal to each other to obtain the equivalent homogeneous system of
equations:

x + y + z = 0
x     + z = 0
x         = 0
x + y     = 0

Solving the above system we obtain only the zero solution, x = 0, y = 0, z = 0. We have
shown that xA + yB + zC = 0 implies x = 0, y = 0, z = 0; hence the matrices A, B and C are
linearly independent.

(ii) Set a linear combination of the matrices A, B and C equal to the zero matrix using unknown
scalars x, y and z; that is, set xA + yB + zC = 0. Thus:

x(1  2) + y(3  -1) + z( 1  -5)  =  (0  0)
 (3  1)    (2   2)    (-4   0)     (0  0)

or    ( x  2x) + (3y  -y) + (  z  -5z)  =  (0  0)
      (3x   x)   (2y  2y)   (-4z    0)     (0  0)

or    ( x + 3y +  z    2x -  y - 5z)  =  (0  0)
      (3x + 2y - 4z     x + 2y     )     (0  0)

Set corresponding entries equal to each other to obtain the equivalent homogeneous system of
linear equations and reduce to echelon form:

 x + 3y +  z = 0            x + 3y + z = 0
2x -  y - 5z = 0              -7y - 7z = 0     or finally     x + 3y + z = 0
3x + 2y - 4z = 0    or        -7y - 7z = 0                        y +  z = 0
 x + 2y      = 0               -y -  z = 0

The system in echelon form has a free variable and hence a nonzero solution, for example,
x = 2, y = -1, z = 1. We have shown that xA + yB + zC = 0 does not imply that x = 0,
y = 0, z = 0; hence the matrices are linearly dependent.

5.4.  Let V be the vector space of polynomials of degree ≤ 3 over R. Determine whether
u, v, w ∈ V are independent or dependent where:

(i) u = t³ - 3t² + 5t + 1,   v = t³ - t² + 8t + 2,   w = 2t³ - 4t² + 9t + 5
(ii) u = t³ + 4t² - 2t + 3,   v = t³ + 6t² - t + 4,   w = 3t³ + 8t² - 8t + 7

(i) Set a linear combination of the polynomials u, v and w equal to the zero polynomial using
unknown scalars x, y and z; that is, set xu + yv + zw = 0. Thus:

x(t³ - 3t² + 5t + 1) + y(t³ - t² + 8t + 2) + z(2t³ - 4t² + 9t + 5) = 0

or   xt³ - 3xt² + 5xt + x + yt³ - yt² + 8yt + 2y + 2zt³ - 4zt² + 9zt + 5z = 0

or   (x + y + 2z)t³ + (-3x - y - 4z)t² + (5x + 8y + 9z)t + (x + 2y + 5z) = 0

The coefficients of the powers of t must each be 0:

  x +  y + 2z = 0
-3x -  y - 4z = 0
 5x + 8y + 9z = 0
  x + 2y + 5z = 0

Solving the above homogeneous system, we obtain only the zero solution: x = 0, y = 0, z = 0; 
hence u, v and w are independent. 



CHAP. 5] BASIS AND DIMENSION 97 

(ii) Set a linear combination of the polynomials u, v and w equal to the zero polynomial using
unknown scalars x, y and z; that is, set xu + yv + zw = 0. Thus:

x(t³ + 4t² - 2t + 3) + y(t³ + 6t² - t + 4) + z(3t³ + 8t² - 8t + 7) = 0

or   xt³ + 4xt² - 2xt + 3x + yt³ + 6yt² - yt + 4y + 3zt³ + 8zt² - 8zt + 7z = 0

or   (x + y + 3z)t³ + (4x + 6y + 8z)t² + (-2x - y - 8z)t + (3x + 4y + 7z) = 0

Set the coefficients of the powers of t each equal to 0 and reduce the system to echelon form:

  x +  y + 3z = 0          x + y + 3z = 0
 4x + 6y + 8z = 0             2y - 4z = 0          x + y + 3z = 0
-2x -  y - 8z = 0    or        y - 2z = 0    or        y - 2z = 0
 3x + 4y + 7z = 0              y - 2z = 0

The system in echelon form has a free variable and hence a nonzero solution. We have shown
that xu + yv + zw = 0 does not imply that x = 0, y = 0, z = 0; hence the polynomials are
linearly dependent.

5.5.  Let V be the vector space of functions from R into R. Show that f, g, h ∈ V are
independent where: (i) f(t) = eᵗ, g(t) = t², h(t) = t;  (ii) f(t) = sin t, g(t) = cos t,
h(t) = t.

In each case set a linear combination of the functions equal to the zero function using unknown
scalars x, y and z: xf + yg + zh = 0; and then show that x = 0, y = 0, z = 0. We emphasize that
xf + yg + zh = 0 means that, for every value of t, xf(t) + yg(t) + zh(t) = 0.

(i) In the equation xeᵗ + yt² + zt = 0, substitute

t = 0   to obtain   xe⁰ + y·0 + z·0 = 0   or   x = 0

t = 1   to obtain   xe + y + z = 0

t = 2   to obtain   xe² + 4y + 2z = 0

Solve the system

x = 0,   xe + y + z = 0,   xe² + 4y + 2z = 0

to obtain only the zero solution: x = 0, y = 0, z = 0.

Hence f, g and h are independent.



(ii) Method 1.  In the equation x sin t + y cos t + zt = 0, substitute

t = 0      to obtain   x·0 + y·1 + z·0 = 0     or   y = 0
t = π/2    to obtain   x·1 + y·0 + zπ/2 = 0    or   x + πz/2 = 0
t = π      to obtain   x·0 + y·(-1) + z·π = 0  or   -y + πz = 0

Solve the system

y = 0,   x + πz/2 = 0,   -y + πz = 0

to obtain only the zero solution: x = 0, y = 0, z = 0. Hence
f, g and h are independent.

Method 2.  Take the first, second and third derivatives of x sin t + y cos t + zt = 0 with
respect to t to get

 x cos t - y sin t + z = 0     (1)

-x sin t - y cos t     = 0     (2)

-x cos t + y sin t     = 0     (3)

Add (1) and (3) to obtain z = 0. Multiply (2) by sin t and (3) by cos t, and then add:

sin t × (2):   -x sin² t - y sin t cos t = 0
cos t × (3):   -x cos² t + y sin t cos t = 0

               -x(sin² t + cos² t) = 0   or   x = 0

Lastly, multiply (2) by -cos t and (3) by sin t; and then add to obtain

y(cos² t + sin² t) = 0   or   y = 0

Since x sin t + y cos t + zt = 0 implies x = 0, y = 0, z = 0,

f, g and h are independent.

5.6. Let u, V and w be independent vectors. Show that u + v, u-v and u-2v + w are 
also independent. 

Suppose x(u + v) + y(u - v) + z(u - 2v + w) = 0 where x, y and z are scalars. Then

xu + xv + yu - yv + zu - 2zv + zw = 0     or     (x + y + z)u + (x - y - 2z)v + zw = 0

But u, v and w are linearly independent; hence the coefficients in the above relation are each 0:

x + y +  z = 0
x - y - 2z = 0
         z = 0

The only solution to the above system is x = 0, y = 0, z = 0. Hence u + v, u - v and u - 2v + w are
independent.

5.7.  Let v1, v2, ..., vm be independent vectors, and suppose u is a linear combination of
the vi, say u = a1v1 + a2v2 + ... + amvm where the ai are scalars. Show that the
above representation of u is unique.

Suppose u = b1v1 + b2v2 + ... + bmvm where the bi are scalars. Subtracting,

0 = u - u = (a1 - b1)v1 + (a2 - b2)v2 + ... + (am - bm)vm

But the vi are linearly independent; hence the coefficients in the above relation are each 0:

a1 - b1 = 0,   a2 - b2 = 0,   ...,   am - bm = 0

Hence a1 = b1, a2 = b2, ..., am = bm and so the above representation of u as a linear combination
of the vi is unique.

5.8.  Show that the vectors v = (1 + i, 2i) and w = (1, 1 + i) in C² are linearly dependent
over the complex field C but are linearly independent over the real field R.

Recall that 2 vectors are dependent iff one is a multiple of the other. Since the first coordinate
of w is 1, v can be a multiple of w only if v = (1 + i)w. But 1 + i ∉ R; hence v and w are independent
over R. Since

(1 + i)w = (1 + i)(1, 1 + i) = (1 + i, 2i) = v

and 1 + i ∈ C, they are dependent over C.

5.9.  Suppose S = {v1, ..., vm} contains a dependent subset, say {v1, ..., vr}. Show that
S is also dependent. Hence every subset of an independent set is independent.

Since {v1, ..., vr} is dependent, there exist scalars a1, ..., ar, not all 0, such that

a1v1 + a2v2 + ... + arvr = 0

Hence there exist scalars a1, ..., ar, 0, ..., 0, not all 0, such that

a1v1 + ... + arvr + 0v_{r+1} + ... + 0vm = 0

Accordingly, S is dependent.

5.10.  Suppose {v1, ..., vm} is independent, but {v1, ..., vm, w} is dependent. Show that w
is a linear combination of the vi.

Method 1.  Since {v1, ..., vm, w} is dependent, there exist scalars a1, ..., am, b, not all 0, such that
a1v1 + ... + amvm + bw = 0. If b = 0, then one of the ai is not zero and a1v1 + ... + amvm = 0.
But this contradicts the hypothesis that {v1, ..., vm} is independent. Accordingly, b ≠ 0 and so

w = b⁻¹(-a1v1 - ... - amvm) = -b⁻¹a1v1 - ... - b⁻¹amvm

That is, w is a linear combination of the vi.

Method 2.  If w = 0, then w = 0v1 + ... + 0vm. On the other hand, if w ≠ 0 then, by Lemma
5.2, one of the vectors in {v1, ..., vm, w} is a linear combination of the preceding vectors. This
vector cannot be one of the v's since {v1, ..., vm} is independent. Hence w is a linear combination
of the vi.

PROOFS OF THEOREMS 

5.11.  Prove Lemma 5.2: The nonzero vectors v1, ..., vm are linearly dependent if and only
if one of them, say vi, is a linear combination of the preceding vectors: vi =
a1v1 + ... + a_{i-1}v_{i-1}.

Suppose vi = a1v1 + ... + a_{i-1}v_{i-1}. Then

a1v1 + ... + a_{i-1}v_{i-1} - vi + 0v_{i+1} + ... + 0vm = 0

and the coefficient of vi is not 0. Hence the vi are linearly dependent.

Conversely, suppose the vi are linearly dependent. Then there exist scalars a1, ..., am, not all
0, such that a1v1 + ... + amvm = 0. Let k be the largest integer such that ak ≠ 0. Then

a1v1 + ... + akvk + 0v_{k+1} + ... + 0vm = 0     or     a1v1 + ... + akvk = 0

Suppose k = 1; then a1v1 = 0, a1 ≠ 0 and so v1 = 0. But the vi are nonzero vectors; hence
k > 1 and

vk = -ak⁻¹a1v1 - ... - ak⁻¹a_{k-1}v_{k-1}

That is, vk is a linear combination of the preceding vectors.

5.12.  Prove Theorem 5.1: The nonzero rows R1, ..., Rn of a matrix in echelon form are
linearly independent.

Suppose {Rn, R_{n-1}, ..., R1} is dependent. Then one of the rows, say Rm, is a linear combination
of the preceding rows:

Rm = a_{m+1}R_{m+1} + a_{m+2}R_{m+2} + ... + a_nR_n     (*)

Now suppose the kth component of Rm is its first nonzero entry. Then, since the matrix is in echelon
form, the kth components of R_{m+1}, ..., R_n are all 0, and so the kth component of (*) is a_{m+1}·0 +
a_{m+2}·0 + ... + a_n·0 = 0. But this contradicts the assumption that the kth component of Rm is
not 0. Thus R1, ..., Rn are independent.

5.13.  Suppose {v1, ..., vm} generates a vector space V. Prove:

(i) If w ∈ V, then {w, v1, ..., vm} is linearly dependent and generates V.

(ii) If vi is a linear combination of the preceding vectors, then {v1, ..., v_{i-1}, v_{i+1}, ..., vm}
generates V.

(i) If w ∈ V, then w is a linear combination of the vi since {vi} generates V. Accordingly,
{w, v1, ..., vm} is linearly dependent. Clearly, w with the vi generate V since the vi by them-
selves generate V. That is, {w, v1, ..., vm} generates V.

(ii) Suppose vi = k1v1 + ... + k_{i-1}v_{i-1}. Let u ∈ V. Since {vi} generates V, u is a linear com-
bination of the vj, say u = a1v1 + ... + amvm. Substituting for vi, we obtain

u = a1v1 + ... + a_{i-1}v_{i-1} + ai(k1v1 + ... + k_{i-1}v_{i-1}) + a_{i+1}v_{i+1} + ... + amvm

  = (a1 + aik1)v1 + ... + (a_{i-1} + aik_{i-1})v_{i-1} + a_{i+1}v_{i+1} + ... + amvm

Thus {v1, ..., v_{i-1}, v_{i+1}, ..., vm} generates V. In other words, we can delete vi from the gen-
erating set and still retain a generating set.

5.14.  Prove Lemma 5.4: Suppose {v1, ..., vn} generates a vector space V. If {w1, ..., wm}
is linearly independent, then m ≤ n and V is generated by a set of the form
{w1, ..., wm, v_{i1}, ..., v_{i_{n-m}}}. Thus, in particular, any n + 1 or more vectors in V are
linearly dependent.

It suffices to prove the theorem in the case that the vi are all not 0. (Prove!) Since {vi}
generates V, we have by the preceding problem that

{w1, v1, ..., vn}     (1)

is linearly dependent and also generates V. By Lemma 5.2, one of the vectors in (1) is a linear com-
bination of the preceding vectors. This vector cannot be w1, so it must be one of the v's, say vj.
Thus by the preceding problem we can delete vj from the generating set (1) and obtain the generating
set

{w1, v1, ..., v_{j-1}, v_{j+1}, ..., vn}     (2)

Now we repeat the argument with the vector w2. That is, since (2) generates V, the set

{w1, w2, v1, ..., v_{j-1}, v_{j+1}, ..., vn}     (3)

is linearly dependent and also generates V. Again by Lemma 5.2, one of the vectors in (3) is a linear
combination of the preceding vectors. We emphasize that this vector cannot be w1 or w2 since
{w1, ..., wm} is independent; hence it must be one of the v's, say vk. Thus by the preceding problem
we can delete vk from the generating set (3) and obtain the generating set

{w1, w2, v1, ..., v_{j-1}, v_{j+1}, ..., v_{k-1}, v_{k+1}, ..., vn}

We repeat the argument with w3 and so forth. At each step we are able to add one of the
w's and delete one of the v's in the generating set. If m ≤ n, then we finally obtain a generating
set of the required form:

{w1, ..., wm, v_{i1}, ..., v_{i_{n-m}}}

Lastly, we show that m > n is not possible. Otherwise, after n of the above steps, we obtain
the generating set {w1, ..., wn}. This implies that w_{n+1} is a linear combination of w1, ..., wn, which
contradicts the hypothesis that {wi} is linearly independent.

5.15. Prove Theorem 5.3: Let V be a finite dimensional vector space. Then every basis of
V has the same number of vectors.

Suppose {e_1, e_2, ..., e_n} is a basis of V, and suppose {f_1, f_2, ...} is another basis of V. Since
{e_i} generates V, the basis {f_1, f_2, ...} must contain n or fewer vectors, or else it is dependent by the
preceding problem. On the other hand, if the basis {f_1, f_2, ...} contains fewer than n vectors, then
{e_1, ..., e_n} is dependent by the preceding problem. Thus the basis {f_1, f_2, ...} contains exactly n
vectors, and so the theorem is true.

5.16. Prove Theorem 5.5: Suppose {v_1, ..., v_m} is a maximal independent subset of a set
S which generates a vector space V. Then {v_1, ..., v_m} is a basis of V.

Suppose w ∈ S. Then, since {v_i} is a maximal independent subset of S, {v_1, ..., v_m, w} is
linearly dependent. By Problem 5.10, w is a linear combination of the v_i, that is, w ∈ L(v_i). Hence
S ⊆ L(v_i). This leads to V = L(S) ⊆ L(v_i) ⊆ V. Accordingly, {v_i} generates V and, since it is in-
dependent, it is a basis of V.




5.17. Suppose V is generated by a finite set S. Show that V is of finite dimension and, in
particular, a subset of S is a basis of V. 

Method 1. Of all the independent subsets of S, and there is a finite number of them since S is finite, 
one of them is maximal. By the preceding problem this subset of S is a basis of V. 

Method 2. If S is independent, it is a basis of V. If S is dependent, one of the vectors is a linear 
combination of the preceding vectors. We may delete this vector and still retain a generating set. 
We continue this process until we obtain a subset which is independent and generates V, i.e. is a 
basis of V. 



5.18. Prove Theorem 5.6: Let V be of finite dimension n. Then:
(i) Any set of n + 1 or more vectors is linearly dependent.
(ii) Any linearly independent set is part of a basis.
(iii) A linearly independent set with n elements is a basis.

Suppose {e_1, ..., e_n} is a basis of V.

(i) Since {e_1, ..., e_n} generates V, any n + 1 or more vectors are dependent by Lemma 5.4.

(ii) Suppose {v_1, ..., v_r} is independent. By Lemma 5.4, V is generated by a set of the form

    S = {v_1, ..., v_r, e_{i_1}, ..., e_{i_{n-r}}}

By the preceding problem, a subset of S is a basis. But S contains n elements and every basis
of V contains n elements. Thus S is a basis of V and contains {v_1, ..., v_r} as a subset.

(iii) By (ii), an independent set T with n elements is part of a basis. But every basis of V contains
n elements. Thus T is a basis.



5.19. Prove Theorem 5.7: Let W be a subspace of an n-dimensional vector space V. Then
dim W ≤ n. In particular, if dim W = n, then W = V.

Since V is of dimension n, any n + 1 or more vectors are linearly dependent. Furthermore, since
a basis of W consists of linearly independent vectors, it cannot contain more than n elements.
Accordingly, dim W ≤ n.

In particular, if {w_1, ..., w_n} is a basis of W, then since it is an independent set with n elements
it is also a basis of V. Thus W = V when dim W = n.



5.20. Prove Theorem 5.8: dim(U + W) = dim U + dim W - dim(U ∩ W).

Observe that U ∩ W is a subspace of both U and W. Suppose dim U = m, dim W = n and
dim(U ∩ W) = r. Suppose {v_1, ..., v_r} is a basis of U ∩ W. By Theorem 5.6(ii), we can extend {v_i}
to a basis of U and to a basis of W; say,

    {v_1, ..., v_r, u_1, ..., u_{m-r}}   and   {v_1, ..., v_r, w_1, ..., w_{n-r}}

are bases of U and W respectively. Let

    B = {v_1, ..., v_r, u_1, ..., u_{m-r}, w_1, ..., w_{n-r}}

Note that B has exactly m + n - r elements. Thus the theorem is proved if we can show that B is a
basis of U + W. Since {v_i, u_j} generates U and {v_i, w_k} generates W, the union B = {v_i, u_j, w_k}
generates U + W. Thus it suffices to show that B is independent.

Suppose

    a_1 v_1 + ... + a_r v_r + b_1 u_1 + ... + b_{m-r} u_{m-r} + c_1 w_1 + ... + c_{n-r} w_{n-r} = 0    (1)

where a_i, b_j, c_k are scalars. Let

    v = a_1 v_1 + ... + a_r v_r + b_1 u_1 + ... + b_{m-r} u_{m-r}    (2)

By (1), we also have that

    v = -c_1 w_1 - ... - c_{n-r} w_{n-r}    (3)

Since {v_i, u_j} ⊆ U, v ∈ U by (2); and since {w_k} ⊆ W, v ∈ W by (3). Accordingly, v ∈ U ∩ W.
Now {v_i} is a basis of U ∩ W and so there exist scalars d_1, ..., d_r for which v = d_1 v_1 + ... + d_r v_r.
Thus by (3) we have

    d_1 v_1 + ... + d_r v_r + c_1 w_1 + ... + c_{n-r} w_{n-r} = 0

But {v_i, w_k} is a basis of W and so is independent. Hence the above equation forces c_1 = 0, ...,
c_{n-r} = 0. Substituting this into (1), we obtain

    a_1 v_1 + ... + a_r v_r + b_1 u_1 + ... + b_{m-r} u_{m-r} = 0

But {v_i, u_j} is a basis of U and so is independent. Hence the above equation forces a_1 = 0, ...,
a_r = 0, b_1 = 0, ..., b_{m-r} = 0.

Since the equation (1) implies that the a_i, b_j and c_k are all 0, B = {v_i, u_j, w_k} is independent
and the theorem is proved.

5.21. Prove Theorem 5.9: The row rank and the column rank of any matrix are equal.

Let A be an arbitrary m × n matrix:

    A = ( a_11  a_12  ...  a_1n )
        ( a_21  a_22  ...  a_2n )
        ( ..................... )
        ( a_m1  a_m2  ...  a_mn )

Let R_1, R_2, ..., R_m denote its rows:

    R_1 = (a_11, a_12, ..., a_1n),  ...,  R_m = (a_m1, a_m2, ..., a_mn)

Suppose the row rank is r and that the following r vectors form a basis for the row space:

    S_1 = (b_11, b_12, ..., b_1n),  S_2 = (b_21, b_22, ..., b_2n),  ...,  S_r = (b_r1, b_r2, ..., b_rn)

Then each of the row vectors is a linear combination of the S_i:

    R_1 = k_11 S_1 + k_12 S_2 + ... + k_1r S_r
    R_2 = k_21 S_1 + k_22 S_2 + ... + k_2r S_r
    .......................................
    R_m = k_m1 S_1 + k_m2 S_2 + ... + k_mr S_r

where the k_ij are scalars. Setting the ith components of each of the above vector equations equal to
each other, we obtain the following system of equations, each valid for i = 1, ..., n:

    a_1i = k_11 b_1i + k_12 b_2i + ... + k_1r b_ri
    a_2i = k_21 b_1i + k_22 b_2i + ... + k_2r b_ri
    .........................................
    a_mi = k_m1 b_1i + k_m2 b_2i + ... + k_mr b_ri

Thus for i = 1, ..., n:

    (a_1i, a_2i, ..., a_mi) = b_1i (k_11, k_21, ..., k_m1) + b_2i (k_12, k_22, ..., k_m2) + ... + b_ri (k_1r, k_2r, ..., k_mr)

In other words, each of the columns of A is a linear combination of the r vectors

    (k_11, k_21, ..., k_m1),  (k_12, k_22, ..., k_m2),  ...,  (k_1r, k_2r, ..., k_mr)

Thus the column space of the matrix A has dimension at most r, i.e. column rank ≤ r. Hence column
rank ≤ row rank.

Similarly (or by considering the transpose matrix A^t) we obtain row rank ≤ column rank. Thus
the row rank and column rank are equal.
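
The equality of row rank and column rank is easy to check numerically. The following sketch is an addition to the text (the matrix is an arbitrary example, not taken from the book); it uses the sympy library to compare the rank of a matrix with the rank of its transpose.

    from sympy import Matrix

    # An arbitrary 3 x 4 example matrix, used only to illustrate Theorem 5.9.
    A = Matrix([[1, 2, 0, -1],
                [2, 6, -3, -3],
                [3, 10, -6, -5]])

    print(A.rank())     # row rank of A        -> 2
    print(A.T.rank())   # rank of the transpose -> 2, the column rank of A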

BASIS AND DIMENSION 

5.22. Determine whether or not the following form a basis for the vector space R^3:

    (i) (1, 1, 1) and (1, -1, 5)
    (ii) (1, 2, 3), (1, 0, -1), (3, -1, 0) and (2, 1, -2)
    (iii) (1, 1, 1), (1, 2, 3) and (2, -1, 1)
    (iv) (1, 1, 2), (1, 2, 5) and (5, 3, 4)

(i) and (ii). No; for a basis of R^3 must contain exactly 3 elements, since R^3 is of dimension 3.

(iii) The vectors form a basis if and only if they are independent. Thus form the matrix whose
rows are the given vectors, and row reduce to echelon form:

    ( 1  1  1 )      ( 1  1  1 )      ( 1  1  1 )
    ( 1  2  3 )  to  ( 0  1  2 )  to  ( 0  1  2 )
    ( 2 -1  1 )      ( 0 -3 -1 )      ( 0  0  5 )

The echelon matrix has no zero rows; hence the three vectors are independent and so form a
basis for R^3.

(iv) Form the matrix whose rows are the given vectors, and row reduce to echelon form:

    ( 1  1  2 )      ( 1  1  2 )      ( 1  1  2 )
    ( 1  2  5 )  to  ( 0  1  3 )  to  ( 0  1  3 )
    ( 5  3  4 )      ( 0 -2 -6 )      ( 0  0  0 )

The echelon matrix has a zero row, i.e. only two nonzero rows; hence the three vectors are
dependent and so do not form a basis for R^3.
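
The same test can be run mechanically. The sketch below is an addition to the text (it assumes the sympy library); the vectors form a basis of R^3 exactly when the matrix having them as rows has rank 3.

    from sympy import Matrix

    def is_basis_of_R3(vectors):
        # Three vectors form a basis of R^3 iff the matrix of rows has rank 3.
        return Matrix(vectors).rank() == 3

    print(is_basis_of_R3([[1, 1, 1], [1, 2, 3], [2, -1, 1]]))   # True  (part (iii))
    print(is_basis_of_R3([[1, 1, 2], [1, 2, 5], [5, 3, 4]]))    # False (part (iv))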

5.23. Let W be the subspace of R^4 generated by the vectors (1, -2, 5, -3), (2, 3, 1, -4) and
(3, 8, -3, -5). (i) Find a basis and the dimension of W. (ii) Extend the basis of W
to a basis of the whole space R^4.

(i) Form the matrix whose rows are the given vectors, and row reduce to echelon form:

    ( 1 -2  5 -3 )      ( 1 -2   5 -3 )      ( 1 -2  5 -3 )
    ( 2  3  1 -4 )  to  ( 0  7  -9  2 )  to  ( 0  7 -9  2 )
    ( 3  8 -3 -5 )      ( 0 14 -18  4 )      ( 0  0  0  0 )
The nonzero rows (1, -2, 5, -3) and (0, 7, -9, 2) of the echelon matrix form a basis of the row
space, that is, of W. Thus, in particular, dim W = 2.

(ii) We seek four independent vectors which include the above two vectors. The vectors (1, -2, 5, -3),
(0, 7, -9, 2), (0, 0, 1, 0) and (0, 0, 0, 1) are independent (since they form an echelon matrix), and
so they form a basis of R^4 which is an extension of the basis of W.
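
For readers who wish to reproduce this, here is a small sketch (an addition to the text, using sympy). The nonzero rows of the reduced echelon form give a basis of W (generally a different but equivalent one), and appending standard unit vectors that raise the rank extends it to a basis of R^4.

    from sympy import Matrix, eye

    A = Matrix([[1, -2, 5, -3],
                [2, 3, 1, -4],
                [3, 8, -3, -5]])

    R, pivots = A.rref()          # reduced row echelon form of A
    r = len(pivots)               # number of nonzero rows = dim W = 2
    B = R[:r, :]                  # rows span W

    # Extend to a basis of R^4: keep a unit vector only if it increases the rank.
    for i in range(4):
        candidate = B.col_join(eye(4)[i, :])
        if candidate.rank() > B.rank():
            B = candidate
    print(B)                      # 4 independent rows: a basis of R^4 extending a basis of W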

5.24. Let W be the space generated by the polynomials

    v_1 = t^3 - 2t^2 + 4t + 1        v_3 = t^3 + 6t - 5
    v_2 = 2t^3 - 3t^2 + 9t - 1       v_4 = 2t^3 - 5t^2 + 7t + 5

Find a basis and the dimension of W.

The coordinate vectors of the given polynomials relative to the basis {t^3, t^2, t, 1} are respectively

    [v_1] = (1, -2, 4, 1)        [v_3] = (1, 0, 6, -5)
    [v_2] = (2, -3, 9, -1)       [v_4] = (2, -5, 7, 5)




Form the matrix whose rows are the above coordinate vectors, and row reduce to echelon form:

    ( 1 -2  4  1 )      ( 1 -2  4  1 )      ( 1 -2  4  1 )
    ( 2 -3  9 -1 )  to  ( 0  1  1 -3 )  to  ( 0  1  1 -3 )
    ( 1  0  6 -5 )      ( 0  2  2 -6 )      ( 0  0  0  0 )
    ( 2 -5  7  5 )      ( 0 -1 -1  3 )      ( 0  0  0  0 )
The nonzero rows (1, -2, 4, 1) and (0, 1, 1, -3) of the echelon matrix form a basis of the space
generated by the coordinate vectors, and so the corresponding polynomials

    t^3 - 2t^2 + 4t + 1   and   t^2 + t - 3

form a basis of W. Thus dim W = 2.

5.25. Find the dimension and a basis of the solution space W of the system

    x + 2y + 2z - s + 3t = 0
    x + 2y + 3z + s + t = 0
    3x + 6y + 8z + s + 5t = 0

Reduce the system to echelon form:

    x + 2y + 2z - s + 3t = 0            x + 2y + 2z - s + 3t = 0
             z + 2s - 2t = 0    or               z + 2s - 2t = 0
            2z + 4s - 4t = 0

The system in echelon form has 2 (nonzero) equations in 5 unknowns; hence the dimension of the
solution space W is 5 - 2 = 3. The free variables are y, s and t. Set

    (i) y = 1, s = 0, t = 0,    (ii) y = 0, s = 1, t = 0,    (iii) y = 0, s = 0, t = 1

to obtain the respective solutions

    v_1 = (-2, 1, 0, 0, 0),    v_2 = (5, 0, -2, 1, 0),    v_3 = (-7, 0, 2, 0, 1)

The set {v_1, v_2, v_3} is a basis of the solution space W.
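
A sketch of the same computation with sympy (an addition to the text): the solution space of a homogeneous system is the nullspace of its coefficient matrix, with one basis vector per free variable. Note that sympy may pick a different, but equivalent, basis than the one found above.

    from sympy import Matrix

    # Coefficient matrix in the unknowns (x, y, z, s, t).
    A = Matrix([[1, 2, 2, -1, 3],
                [1, 2, 3, 1, 1],
                [3, 6, 8, 1, 5]])

    basis = A.nullspace()   # one basis vector of the solution space per free variable
    print(len(basis))       # 3, so dim W = 3
    for v in basis:
        print(v.T)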

5.26. Find a homogeneous system whose solution set W is generated by

    {(1, -2, 0, 3), (1, -1, -1, 4), (1, 0, -2, 5)}

Method 1. Let v = (x, y, z, w). Form the matrix M whose first rows are the given vectors and whose
last row is v; and then row reduce to echelon form:

    M = ( 1 -2  0  3 )      ( 1 -2    0  3    )      ( 1 -2  0        3        )
        ( 1 -1 -1  4 )  to  ( 0  1   -1  1    )  to  ( 0  1 -1        1        )
        ( 1  0 -2  5 )      ( 0  2   -2  2    )      ( 0  0  2x+y+z  -5x-y+w   )
        ( x  y  z  w )      ( 0 2x+y  z  w-3x )      ( 0  0  0        0        )

The original first three rows show that W has dimension 2. Thus v ∈ W if and only if the addi-
tional row does not increase the dimension of the row space. Hence we set the last two entries
in the third row on the right equal to 0 to obtain the required homogeneous system

    2x + y + z     = 0
    5x + y     - w = 0

Method 2. We know that v = (x, y, z, w) ∈ W if and only if v is a linear combination of the gen-
erators of W:

    (x, y, z, w) = r(1, -2, 0, 3) + s(1, -1, -1, 4) + t(1, 0, -2, 5)

The above vector equation in unknowns r, s and t is equivalent to the following system:

    r + s + t = x            r + s + t = x             r + s + t = x
    -2r - s     = y              s + 2t = 2x + y           s + 2t = 2x + y
        -s - 2t = z    or       -s - 2t = z        or         0 = 2x + y + z    (1)
    3r + 4s + 5t = w             s + 2t = w - 3x               0 = 5x + y - w

Thus v ∈ W if and only if the above system has a solution, i.e. if

    2x + y + z     = 0
    5x + y     - w = 0

The above is the required homogeneous system.

Remark: Observe that the augmented matrix of the system (1) is the transpose of the matrix
M used in the first method.
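
Method 1 can also be phrased computationally (an added sketch, assuming sympy). A coefficient vector c = (c_1, c_2, c_3, c_4) gives an equation c_1 x + c_2 y + c_3 z + c_4 w = 0 satisfied by every vector of W exactly when M c = 0, where the rows of M are the generators of W; so the nullspace of M supplies a defining homogeneous system.

    from sympy import Matrix

    M = Matrix([[1, -2, 0, 3],
                [1, -1, -1, 4],
                [1, 0, -2, 5]])

    for c in M.nullspace():
        print(c.T)   # (2, 1, 1, 0)   ->  2x + y + z = 0
                     # (-5, -1, 0, 1) -> -5x - y + w = 0, i.e. 5x + y - w = 0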

5.27. Let U and W be the following subspaces of R^4:

    U = {(a, b, c, d) : b + c + d = 0},    W = {(a, b, c, d) : a + b = 0, c = 2d}

Find the dimension and a basis of (i) U, (ii) W, (iii) U ∩ W.

(i) We seek a basis of the set of solutions (a, b, c, d) of the equation

    b + c + d = 0    or    0*a + b + c + d = 0

The free variables are a, c and d. Set

    (1) a = 1, c = 0, d = 0,    (2) a = 0, c = 1, d = 0,    (3) a = 0, c = 0, d = 1

to obtain the respective solutions

    v_1 = (1, 0, 0, 0),    v_2 = (0, -1, 1, 0),    v_3 = (0, -1, 0, 1)

The set {v_1, v_2, v_3} is a basis of U, and dim U = 3.

(ii) We seek a basis of the set of solutions (a, b, c, d) of the system

    a + b = 0        or    a + b     = 0
    c = 2d                     c - 2d = 0

The free variables are b and d. Set

    (1) b = 1, d = 0,    (2) b = 0, d = 1

to obtain the respective solutions

    v_1 = (-1, 1, 0, 0),    v_2 = (0, 0, 2, 1)

The set {v_1, v_2} is a basis of W, and dim W = 2.

(iii) U ∩ W consists of those vectors (a, b, c, d) which satisfy the conditions defining U and the con-
ditions defining W, i.e. the three equations

    b + c + d = 0            a + b         = 0
    a + b     = 0    or          b + c + d = 0
    c = 2d                           c - 2d = 0

The free variable is d. Set d = 1 to obtain the solution v = (3, -3, 2, 1). Thus {v} is a basis
of U ∩ W, and dim(U ∩ W) = 1.



5.28. Find the dimension of the vector space spanned by:

    (i) (1, -2, 3, -1) and (1, 1, -2, 3)
    (ii) (3, -6, 3, -9) and (-2, 4, -2, 6)
    (iii) t^3 + 2t^2 + 3t + 1 and 2t^3 + 4t^2 + 6t + 2
    (iv) t^3 - 2t^2 + 5 and t^2 + 3t - 4
    (v) two 2 × 2 matrices (entries illegible in this reproduction)
    (vi) ( 1  1 )       ( -3 -3 )
         ( -1 -1 )  and (  3  3 )
    (vii) 3 and -3

Two nonzero vectors span a space W of dimension 2 if they are independent, and of dimension
1 if they are dependent. Recall that two vectors are dependent if and only if one is a multiple of
the other. Hence: (i) 2, (ii) 1, (iii) 1, (iv) 2, (v) 2, (vi) 1, (vii) 1.

5.29. Let V be the vector space of 2 by 2 symmetric matrices over R. Show that
dim V = 3. (Recall that A = (a_ij) is symmetric iff A = A^t or, equivalently, a_ij = a_ji.)

An arbitrary 2 by 2 symmetric matrix is of the form

    A = ( a  b )
        ( b  c )        where a, b, c ∈ R.

(Note that there are three "variables".) Setting

    (i) a = 1, b = 0, c = 0,    (ii) a = 0, b = 1, c = 0,    (iii) a = 0, b = 0, c = 1

we obtain the respective matrices

    E_1 = ( 1 0 ),    E_2 = ( 0 1 ),    E_3 = ( 0 0 )
          ( 0 0 )           ( 1 0 )           ( 0 1 )

We show that {E_1, E_2, E_3} is a basis of V, that is, that it (1) generates V and (2) is independent.

(1) For the above arbitrary matrix A in V, we have

    A = ( a b ) = aE_1 + bE_2 + cE_3
        ( b c )

Thus {E_1, E_2, E_3} generates V.

(2) Suppose xE_1 + yE_2 + zE_3 = 0, where x, y, z are unknown scalars. That is, suppose

    x( 1 0 ) + y( 0 1 ) + z( 0 0 ) = ( 0 0 )
     ( 0 0 )    ( 1 0 )    ( 0 1 )   ( 0 0 )

Setting corresponding entries equal to each other, we obtain x = 0, y = 0, z = 0. In other words,

    xE_1 + yE_2 + zE_3 = 0    implies    x = 0, y = 0, z = 0

Accordingly, {E_1, E_2, E_3} is independent.

Thus {E_1, E_2, E_3} is a basis of V and so the dimension of V is 3.

5.30. Let V be the space of polynomials in t of degree ≤ n. Show that each of the following
is a basis of V:

    (i) {1, t, t^2, ..., t^{n-1}, t^n},    (ii) {1, 1 - t, (1 - t)^2, ..., (1 - t)^{n-1}, (1 - t)^n}.

Thus dim V = n + 1.

(i) Clearly each polynomial in V is a linear combination of 1, t, ..., t^{n-1} and t^n. Furthermore,
1, t, ..., t^{n-1} and t^n are independent since none is a linear combination of the preceding poly-
nomials. Thus {1, t, ..., t^n} is a basis of V.

(ii) (Note that by (i), dim V = n + 1; and so any n + 1 independent polynomials form a basis of
V.) Now each polynomial in the sequence 1, 1 - t, ..., (1 - t)^n is of degree higher than the
preceding ones and so is not a linear combination of the preceding ones. Thus the n + 1 poly-
nomials 1, 1 - t, ..., (1 - t)^n are independent and so form a basis of V.

5.31. Let V be the vector space of ordered pairs of complex numbers over the real field R
(see Problem 4.42). Show that V is of dimension 4.

We claim that the following is a basis of V:

    B = {(1, 0), (i, 0), (0, 1), (0, i)}

Suppose v ∈ V. Then v = (z, w) where z, w are complex numbers, and so v = (a + bi, c + di) where
a, b, c, d are real numbers. Then

    v = a(1, 0) + b(i, 0) + c(0, 1) + d(0, i)

Thus B generates V.

The proof is complete if we show that B is independent. Suppose

    x_1(1, 0) + x_2(i, 0) + x_3(0, 1) + x_4(0, i) = 0

where x_1, x_2, x_3, x_4 ∈ R. Then

    (x_1 + x_2 i, x_3 + x_4 i) = (0, 0)    and so    x_1 + x_2 i = 0,  x_3 + x_4 i = 0

Accordingly x_1 = 0, x_2 = 0, x_3 = 0, x_4 = 0 and so B is independent.

5.32. Let V be the vector space of m × n matrices over a field K. Let E_ij ∈ V be the matrix
with 1 as the ij-entry and 0 elsewhere. Show that {E_ij} is a basis of V. Thus
dim V = mn.

We need to show that {E_ij} generates V and is independent.

Let A = (a_ij) be any matrix in V. Then A = Σ a_ij E_ij. Hence {E_ij} generates V.

Now suppose that Σ a_ij E_ij = 0 where the a_ij are scalars. The ij-entry of Σ a_ij E_ij is a_ij and
the ij-entry of 0 is 0. Thus a_ij = 0, i = 1, ..., m, j = 1, ..., n. Accordingly the matrices E_ij are
independent.

Thus {E_ij} is a basis of V.

Remark: Viewing a vector in K^n as a 1 × n matrix, we have shown by the above result that the
usual basis defined in Example 5.3, page 88, is a basis of K^n and that dim K^n = n.

SUMS AND INTERSECTIONS 

5.33. Suppose TJ and W are distinct 4-dimensional subspaces of a vector space Y of dimen- 
sion 6. Find the possible dimensions of TJV^W. 

Since V and W are distinct, V -VW properly contains 17 and W; hence dim(f7+W)>4. 
But dim(?7+W) cannot be greater than 6, since dimV = 6. Hence we have two possibilities: 
(i) dim(U+T7) = 5, or (ii) dim (U + PF) = 6. Using Theorem 5.8 that dim(f7+ T^) = dim U + 
dim W — dim (Un TF), we obtain 

(i) 5 = 4 + 4 -dim(f/nW) or dim(t7nW) = 3 

(ii) 6 = 4 + 4 -dim(?7nW) or dim(t/nTF) = 2 
That is, the dimension of TJ r\'W must be either 2 or 3. 



5.34. Let U and W be the subspaces of R^4 generated by

    {(1, 1, 0, -1), (1, 2, 3, 0), (2, 3, 3, -1)}   and   {(1, 2, 2, -2), (2, 3, 2, -3), (1, 3, 4, -3)}

respectively. Find (i) dim(U + W), (ii) dim(U ∩ W).

(i) U + W is the space spanned by all six vectors. Hence form the matrix whose rows are the
given six vectors, and then row reduce to echelon form:

    ( 1 1 0 -1 )      ( 1 1 0 -1 )      ( 1 1  0 -1 )
    ( 1 2 3  0 )      ( 0 1 3  1 )      ( 0 1  3  1 )
    ( 2 3 3 -1 )  to  ( 0 1 3  1 )  to  ( 0 0 -1 -2 )
    ( 1 2 2 -2 )      ( 0 1 2 -1 )      ( 0 0  0  0 )
    ( 2 3 2 -3 )      ( 0 1 2 -1 )      ( 0 0  0  0 )
    ( 1 3 4 -3 )      ( 0 2 4 -2 )      ( 0 0  0  0 )

Since the echelon matrix has three nonzero rows, dim(U + W) = 3.






(ii) First find dim U and dim W. Form the two matrices whose rows are the generators of U and 
W respectively and then row reduce each to echelon form: 



    ( 1 1 0 -1 )      ( 1 1 0 -1 )      ( 1 1 0 -1 )
    ( 1 2 3  0 )  to  ( 0 1 3  1 )  to  ( 0 1 3  1 )
    ( 2 3 3 -1 )      ( 0 1 3  1 )      ( 0 0 0  0 )

and

    ( 1 2 2 -2 )      ( 1  2  2 -2 )      ( 1  2  2 -2 )
    ( 2 3 2 -3 )  to  ( 0 -1 -2  1 )  to  ( 0 -1 -2  1 )
    ( 1 3 4 -3 )      ( 0  1  2 -1 )      ( 0  0  0  0 )

Since each of the echelon matrices has two nonzero rows, dim U = 2 and dim W = 2. Using
Theorem 5.8 that dim(U + W) = dim U + dim W - dim(U ∩ W), we have

    3 = 2 + 2 - dim(U ∩ W)    or    dim(U ∩ W) = 1
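
These dimension counts can be checked mechanically; the sketch below is an addition to the text and assumes the sympy library. dim U and dim W are the ranks of the generator matrices, dim(U + W) is the rank of the two stacked together, and dim(U ∩ W) then follows from Theorem 5.8.

    from sympy import Matrix

    U = Matrix([[1, 1, 0, -1], [1, 2, 3, 0], [2, 3, 3, -1]])
    W = Matrix([[1, 2, 2, -2], [2, 3, 2, -3], [1, 3, 4, -3]])

    dim_U   = U.rank()                 # 2
    dim_W   = W.rank()                 # 2
    dim_sum = U.col_join(W).rank()     # rank of all six generators = dim(U + W) = 3

    # Theorem 5.8: dim(U + W) = dim U + dim W - dim(U ∩ W)
    print(dim_U + dim_W - dim_sum)     # 1 = dim(U ∩ W)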



5.35. Let U be the subspace of R^5 generated by

    {(1, 3, -2, 2, 3), (1, 4, -3, 4, 2), (2, 3, -1, -2, 9)}

and let W be the subspace generated by

    {(1, 3, 0, 2, 1), (1, 5, -6, 6, 3), (2, 5, 3, 2, 1)}

Find a basis and the dimension of (i) U + W, (ii) U ∩ W.

(i) U + W is the space generated by all six vectors. Hence form the matrix whose rows are the
six vectors and then row reduce to echelon form:

    ( 1 3 -2  2 3 )      ( 1  3 -2  2  3 )      ( 1 3 -2 2  3 )      ( 1 3 -2 2  3 )
    ( 1 4 -3  4 2 )      ( 0  1 -1  2 -1 )      ( 0 1 -1 2 -1 )      ( 0 1 -1 2 -1 )
    ( 2 3 -1 -2 9 )  to  ( 0 -3  3 -6  3 )  to  ( 0 0  2 0 -2 )  to  ( 0 0  2 0 -2 )
    ( 1 3  0  2 1 )      ( 0  0  2  0 -2 )      ( 0 0  0 0  0 )      ( 0 0  0 0  0 )
    ( 1 5 -6  6 3 )      ( 0  2 -4  4  0 )      ( 0 0 -2 0  2 )      ( 0 0  0 0  0 )
    ( 2 5  3  2 1 )      ( 0 -1  7 -2 -5 )      ( 0 0  6 0 -6 )      ( 0 0  0 0  0 )

The set of nonzero rows of the echelon matrix,

    {(1, 3, -2, 2, 3), (0, 1, -1, 2, -1), (0, 0, 2, 0, -2)}

is a basis of U + W; thus dim(U + W) = 3.

(ii) First find homogeneous systems whose solution sets are U and W respectively. Form the matrix
whose first rows are the generators of U and whose last row is (x, y, z, s, t) and then row reduce
to echelon form:

    ( 1 3 -2  2 3 )      ( 1   3    -2    2    3   )      ( 1 3 -2       2        3       )
    ( 1 4 -3  4 2 )  to  ( 0   1    -1    2   -1   )  to  ( 0 1 -1       2       -1       )
    ( 2 3 -1 -2 9 )      ( 0  -3     3   -6    3   )      ( 0 0 -x+y+z   4x-2y+s  -6x+y+t )
    ( x y  z  s t )      ( 0 y-3x  z+2x  s-2x t-3x )      ( 0 0  0       0         0      )

Set the entries of the third row equal to 0 to obtain the homogeneous system whose solution
set is U:

    -x + y + z = 0,    4x - 2y + s = 0,    -6x + y + t = 0

Now form the matrix whose first rows are the generators of W and whose last row is
(x, y, z, s, t) and then row reduce to echelon form:

    ( 1 3  0 2 1 )      ( 1   3    0   2    1  )      ( 1 3  0         2        1       )
    ( 1 5 -6 6 3 )  to  ( 0   2   -6   4    2  )  to  ( 0 1 -3         2        1       )
    ( 2 5  3 2 1 )      ( 0  -1    3  -2   -1  )      ( 0 0 -9x+3y+z   4x-2y+s  2x-y+t  )
    ( x y  z s t )      ( 0 y-3x   z  s-2x t-x )      ( 0 0  0         0        0       )

Set the entries of the third row equal to 0 to obtain the homogeneous system whose solution
set is W:

    -9x + 3y + z = 0,    4x - 2y + s = 0,    2x - y + t = 0

Combining both systems, we obtain the homogeneous system whose solution set is U ∩ W:

    -x + y + z      = 0        -x + y + z      = 0        -x + y + z      = 0
    4x - 2y    + s  = 0        2y + 4z + s     = 0        2y + 4z + s     = 0
    -6x + y       + t = 0      -5y - 6z     + t = 0       8z + 5s + 2t    = 0
    -9x + 3y + z    = 0        -6y - 8z        = 0        4z + 3s         = 0
    4x - 2y    + s  = 0        2y + 4z + s     = 0             s - 2t     = 0
    2x - y        + t = 0      y + 2z       + t = 0

    -x + y + z   = 0
    2y + 4z + s  = 0
    8z + 5s + 2t = 0
        s - 2t   = 0

There is one free variable, which is t; hence dim(U ∩ W) = 1. Setting t = 2, we obtain the
solution x = 1, y = 4, z = -3, s = 4, t = 2. Thus {(1, 4, -3, 4, 2)} is a basis of U ∩ W.
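
The intersection can be computed along the same lines with sympy (an added sketch, not part of the book). Each subspace is the row space of its generator matrix, so a coefficient row c defines an equation satisfied by the whole subspace exactly when (generators)*c = 0; stacking the two systems and taking the nullspace yields U ∩ W.

    from sympy import Matrix

    U = Matrix([[1, 3, -2, 2, 3], [1, 4, -3, 4, 2], [2, 3, -1, -2, 9]])
    W = Matrix([[1, 3, 0, 2, 1], [1, 5, -6, 6, 3], [2, 5, 3, 2, 1]])

    rows = [c.T for c in U.nullspace()] + [c.T for c in W.nullspace()]
    system = rows[0]
    for r in rows[1:]:
        system = system.col_join(r)

    inter = system.nullspace()   # solutions of both systems = U ∩ W
    print(len(inter))            # 1, so dim(U ∩ W) = 1
    print(inter[0].T)            # a scalar multiple of (1, 4, -3, 4, 2)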



COORDINATE VECTORS 

5.36. Find the coordinate vector of v relative to the basis {(1, 1, 1), (1, 1, 0), (1, 0, 0)} of R^3
where (i) v = (4, -3, 2), (ii) v = (a, b, c).

In each case set v as a linear combination of the basis vectors using unknown scalars x, y and z:

    v = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0)

and then solve for the solution vector (x, y, z). (The solution is unique since the basis vectors are
linearly independent.)

(i) (4, -3, 2) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0)
              = (x, x, x) + (y, y, 0) + (z, 0, 0)
              = (x + y + z, x + y, x)

Set corresponding components equal to each other to obtain the system

    x + y + z = 4,    x + y = -3,    x = 2

Substitute x = 2 into the second equation to obtain y = -5; then put x = 2, y = -5 into
the first equation to obtain z = 7. Thus x = 2, y = -5, z = 7 is the unique solution to the
system and so the coordinate vector of v relative to the given basis is [v] = (2, -5, 7).

(ii) (a, b, c) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (x + y + z, x + y, x)

Then    x + y + z = a,    x + y = b,    x = c

from which x = c, y = b - c, z = a - b. Thus [v] = (c, b - c, a - b), that is, [(a, b, c)] =
(c, b - c, a - b).
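
Finding coordinates relative to a basis is just solving a linear system, so the computation can be delegated to sympy; the sketch below is an addition to the text.

    from sympy import Matrix

    # Columns of B are the basis vectors (1,1,1), (1,1,0), (1,0,0).
    B = Matrix([[1, 1, 1],
                [1, 1, 0],
                [1, 0, 0]]).T
    v = Matrix([4, -3, 2])

    coords = B.solve(v)     # solves B * (x, y, z)^T = v
    print(coords.T)         # (2, -5, 7) = [v] relative to the given basis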

5.37. Let V be the vector space of 2 × 2 matrices over R. Find the coordinate vector of the
matrix A ∈ V relative to the basis

    { ( 1 1 ),  ( 0 -1 ),  ( 1 -1 ),  ( 1 0 ) }        where    A = ( 2  3 )
      ( 1 1 )   ( 1  0 )   ( 0  0 )   ( 0 0 )                       ( 4 -7 )

Set A as a linear combination of the matrices in the basis using unknown scalars x, y, z, w:

    ( 2  3 ) = x( 1 1 ) + y( 0 -1 ) + z( 1 -1 ) + w( 1 0 ) = ( x + z + w   x - y - z )
    ( 4 -7 )     ( 1 1 )    ( 1  0 )    ( 0  0 )    ( 0 0 )   ( x + y       x         )

Set corresponding entries equal to each other to obtain the system

    x + z + w = 2,    x - y - z = 3,    x + y = 4,    x = -7

from which x = -7, y = 11, z = -21, w = 30. Thus [A] = (-7, 11, -21, 30). (Note that the co-
ordinate vector of A must be a vector in R^4 since dim V = 4.)



5.38. Let W be the vector space of 2 × 2 symmetric matrices over R. (See Problem 5.29.)
Find the coordinate vector of the matrix

    A = (  4 -11 )
        ( -11 -7 )

relative to the basis

    { (  1 -2 ),  ( 2 1 ),  (  4 -1 ) }
      ( -2  1 )   ( 1 3 )   ( -1 -5 )

Set A as a linear combination of the matrices in the basis using unknown scalars x, y and z:

    A = (  4 -11 ) = x(  1 -2 ) + y( 2 1 ) + z(  4 -1 ) = ( x + 2y + 4z    -2x + y - z  )
        ( -11 -7 )     ( -2  1 )    ( 1 3 )    ( -1 -5 )   ( -2x + y - z    x + 3y - 5z )

Set corresponding entries equal to each other to obtain the equivalent system of linear equations
and reduce to echelon form:

    x + 2y + 4z = 4            x + 2y + 4z = 4          x + 2y + 4z = 4
    -2x + y - z = -11              5y + 7z = -3    or       5y + 7z = -3
    -2x + y - z = -11    or         y - 9z = -11               52z  = 52
    x + 3y - 5z = -7

We obtain z = 1 from the third equation, then y = -2 from the second equation, and then x = 4
from the first equation. Thus the solution of the system is x = 4, y = -2, z = 1; hence [A] =
(4, -2, 1). (Since dim W = 3 by Problem 5.29, the coordinate vector of A must be a vector in R^3.)

5.39. Let {e_1, e_2, e_3} and {f_1, f_2, f_3} be bases of a vector space V (of dimension 3). Suppose

    e_1 = a_1 f_1 + a_2 f_2 + a_3 f_3
    e_2 = b_1 f_1 + b_2 f_2 + b_3 f_3        (1)
    e_3 = c_1 f_1 + c_2 f_2 + c_3 f_3

Let P be the matrix whose rows are the coordinate vectors of e_1, e_2 and e_3 respectively,
relative to the basis {f_i}:

    P = ( a_1 a_2 a_3 )
        ( b_1 b_2 b_3 )
        ( c_1 c_2 c_3 )

Show that, for any vector v ∈ V, [v]_e P = [v]_f. That is, multiplying the coordinate
vector of v relative to the basis {e_i} by the matrix P, we obtain the coordinate vector
of v relative to the basis {f_i}. (The matrix P is frequently called the change of basis
matrix.)

Suppose v = r e_1 + s e_2 + t e_3; then [v]_e = (r, s, t). Using (1), we have

    v = r(a_1 f_1 + a_2 f_2 + a_3 f_3) + s(b_1 f_1 + b_2 f_2 + b_3 f_3) + t(c_1 f_1 + c_2 f_2 + c_3 f_3)
      = (r a_1 + s b_1 + t c_1) f_1 + (r a_2 + s b_2 + t c_2) f_2 + (r a_3 + s b_3 + t c_3) f_3

Hence    [v]_f = (r a_1 + s b_1 + t c_1,  r a_2 + s b_2 + t c_2,  r a_3 + s b_3 + t c_3)

On the other hand,

    [v]_e P = (r, s, t) ( a_1 a_2 a_3 )
                        ( b_1 b_2 b_3 )
                        ( c_1 c_2 c_3 )

            = (r a_1 + s b_1 + t c_1,  r a_2 + s b_2 + t c_2,  r a_3 + s b_3 + t c_3)

Accordingly, [v]_e P = [v]_f.

Remark: In Chapter 8 we shall write coordinate vectors as column vectors rather than row
vectors. Then, by the above,

    Q [v]_e = ( a_1 b_1 c_1 ) ( r )     ( r a_1 + s b_1 + t c_1 )
              ( a_2 b_2 c_2 ) ( s )  =  ( r a_2 + s b_2 + t c_2 )
              ( a_3 b_3 c_3 ) ( t )     ( r a_3 + s b_3 + t c_3 )

where Q is the matrix whose columns are the coordinate vectors of e_1, e_2 and e_3 respectively, relative
to the basis {f_i}. Note that Q is the transpose of P and that Q appears on the left of the column
vector [v]_e whereas P appears on the right of the row vector [v]_e.
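
The identity [v]_e P = [v]_f can also be checked symbolically; the sketch below is an addition to the text and uses sympy's symbolic matrices.

    from sympy import Matrix, symbols

    a1, a2, a3, b1, b2, b3, c1, c2, c3, r, s, t = symbols('a1 a2 a3 b1 b2 b3 c1 c2 c3 r s t')

    P   = Matrix([[a1, a2, a3], [b1, b2, b3], [c1, c2, c3]])  # rows: [e1]_f, [e2]_f, [e3]_f
    v_e = Matrix([[r, s, t]])                                 # [v]_e as a row vector

    print(v_e * P)
    # Matrix([[a1*r + b1*s + c1*t, a2*r + b2*s + c2*t, a3*r + b3*s + c3*t]]) = [v]_f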



RANK OF A MATRIX 

5.40. Find the rank of the matrix A where:

    (i)  A = ( 1 3  1 -2 -3 )
             ( 1 4  3 -1 -4 )
             ( 2 3 -4 -7 -3 )
             ( 3 8  1 -7 -8 )

    (ii), (iii): the matrices for parts (ii) and (iii) are illegible in this reproduction.

(i) Row reduce to echelon form:

    ( 1 3  1 -2 -3 )      ( 1  3  1 -2 -3 )      ( 1 3 1 -2 -3 )
    ( 1 4  3 -1 -4 )  to  ( 0  1  2  1 -1 )  to  ( 0 1 2  1 -1 )
    ( 2 3 -4 -7 -3 )      ( 0 -3 -6 -3  3 )      ( 0 0 0  0  0 )
    ( 3 8  1 -7 -8 )      ( 0 -1 -2 -1  1 )      ( 0 0 0  0  0 )

Since the echelon matrix has two nonzero rows, rank(A) = 2.




(ii) Since row rank equals column rank, it is easier to form the transpose of A and then row
reduce to echelon form. The resulting echelon matrix has three nonzero rows; thus rank(A) = 3.




(iii) The two columns are linearly independent since one is not a multiple of the other. Hence 
rank (A) = 2. 
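
Rank computations like the ones above are one-liners with sympy; the sketch below is an addition to the text, using the matrix of part (i).

    from sympy import Matrix

    A = Matrix([[1, 3, 1, -2, -3],
                [1, 4, 3, -1, -4],
                [2, 3, -4, -7, -3],
                [3, 8, 1, -7, -8]])

    print(A.rank())     # 2, matching the row reduction in part (i)
    print(A.T.rank())   # 2 as well: row rank = column rank (Theorem 5.9)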



5.41. Let A and B be arbitrary matrices for which the product AB is defined. Show that
rank(AB) ≤ rank(B) and rank(AB) ≤ rank(A).

By Problem 4.33, page 80, the row space of AB is contained in the row space of B; hence
rank(AB) ≤ rank(B). Furthermore, by Problem 4.71, page 84, the column space of AB is contained
in the column space of A; hence rank(AB) ≤ rank(A).



5.42. Let A be an n-square matrix. Show that A is invertible if and only if rank(A) = n.

Note that the rows of the n-square identity matrix I_n are linearly independent since I_n is in
echelon form; hence rank(I_n) = n. Now if A is invertible then, by Problem 3.36, page 57, A is row
equivalent to I_n; hence rank(A) = n. But if A is not invertible then A is row equivalent to a matrix
with a zero row; hence rank(A) < n. That is, A is invertible if and only if rank(A) = n.



5.43. Let x_{i_1}, x_{i_2}, ..., x_{i_k} be the free variables of a homogeneous system of linear equations
with n unknowns. Let v_j be the solution for which: x_{i_j} = 1, and all other free varia-
bles = 0. Show that the solutions v_1, v_2, ..., v_k are linearly independent.

Let A be the matrix whose rows are the v_j respectively. We interchange column 1 and column
i_1, then column 2 and column i_2, ..., and then column k and column i_k, and obtain the k × n matrix

    B = ( 1 0 ... 0 0  c_{1,k+1} ... c_{1n} )
        ( 0 1 ... 0 0  c_{2,k+1} ... c_{2n} )     =   (I, C)
        ( ..................................)
        ( 0 0 ... 0 1  c_{k,k+1} ... c_{kn} )

The above matrix B is in echelon form and so its rows are independent; hence rank(B) = k. Since
A and B are column equivalent, they have the same rank, i.e. rank(A) = k. But A has k rows;
hence these rows, i.e. the v_j, are linearly independent as claimed.



MISCELLANEOUS PROBLEMS 

5.44. The concept of linear dependence is extended to every set of vectors, finite or infinite,
as follows: the set of vectors A = {v_i} is linearly dependent iff there exist vectors
v_{i_1}, ..., v_{i_n} ∈ A and scalars a_1, ..., a_n ∈ K, not all of them 0, such that

    a_1 v_{i_1} + a_2 v_{i_2} + ... + a_n v_{i_n} = 0

Otherwise A is said to be linearly independent. Suppose that A_1, A_2, ... are linearly
independent sets of vectors, and that A_1 ⊆ A_2 ⊆ ... . Show that the union A =
A_1 ∪ A_2 ∪ ... is also linearly independent.

Suppose A is linearly dependent. Then there exist vectors v_1, ..., v_n ∈ A and scalars a_1, ...,
a_n ∈ K, not all of them 0, such that

    a_1 v_1 + a_2 v_2 + ... + a_n v_n = 0        (1)

Since A = ∪ A_i and the v_i ∈ A, there exist sets A_{i_1}, ..., A_{i_n} such that

    v_1 ∈ A_{i_1},  v_2 ∈ A_{i_2},  ...,  v_n ∈ A_{i_n}

Let k be the maximum index of the sets A_{i_j}: k = max(i_1, ..., i_n). It follows then, since A_1 ⊆ A_2 ⊆ ...,
that each A_{i_j} is contained in A_k. Hence v_1, v_2, ..., v_n ∈ A_k and so, by (1), A_k is linearly dependent,
which contradicts our hypothesis. Thus A is linearly independent.




5.45. Consider a finite sequence of vectors S = {v_1, v_2, ..., v_n}. Let T be the sequence of
vectors obtained from S by one of the following "elementary operations": (i) inter-
change two vectors, (ii) multiply a vector by a nonzero scalar, (iii) add a multiple of
one vector to another. Show that S and T generate the same space W. Also show
that T is independent if and only if S is independent.

Observe that, for each operation, the vectors in T are linear combinations of vectors in S. On
the other hand, each operation has an inverse of the same type (Prove!); hence the vectors in S are
linear combinations of vectors in T. Thus S and T generate the same space W. Also, T is inde-
pendent if and only if dim W = n, and this is true iff S is also independent.

5.46. Let A = (a_ij) and B = (b_ij) be row equivalent m × n matrices over a field K, and let
v_1, ..., v_n be any vectors in a vector space V over K. Let

    u_1 = a_11 v_1 + a_12 v_2 + ... + a_1n v_n        w_1 = b_11 v_1 + b_12 v_2 + ... + b_1n v_n
    u_2 = a_21 v_1 + a_22 v_2 + ... + a_2n v_n        w_2 = b_21 v_1 + b_22 v_2 + ... + b_2n v_n
    ..........................................        ..........................................
    u_m = a_m1 v_1 + a_m2 v_2 + ... + a_mn v_n        w_m = b_m1 v_1 + b_m2 v_2 + ... + b_mn v_n

Show that {u_i} and {w_i} generate the same space.

Applying an "elementary operation" of the preceding problem to {u_i} is equivalent to applying
an elementary row operation to the matrix A. Since A and B are row equivalent, B can be obtained
from A by a sequence of elementary row operations; hence {w_i} can be obtained from {u_i} by the
corresponding sequence of operations. Accordingly, {u_i} and {w_i} generate the same space.

5.47. Let v_1, ..., v_n belong to a vector space V over a field K. Let

    w_1 = a_11 v_1 + a_12 v_2 + ... + a_1n v_n
    w_2 = a_21 v_1 + a_22 v_2 + ... + a_2n v_n
    ..........................................
    w_n = a_n1 v_1 + a_n2 v_2 + ... + a_nn v_n

where a_ij ∈ K. Let P be the n-square matrix of coefficients, i.e. let P = (a_ij).

(i) Suppose P is invertible. Show that {w_i} and {v_i} generate the same space; hence
{w_i} is independent if and only if {v_i} is independent.

(ii) Suppose P is not invertible. Show that {w_i} is dependent.

(iii) Suppose {w_i} is independent. Show that P is invertible.

(i) Since P is invertible, it is row equivalent to the identity matrix I. Hence by the preceding
problem {w_i} and {v_i} generate the same space. Thus one is independent if and only if the
other is.

(ii) Since P is not invertible, it is row equivalent to a matrix with a zero row. This means that
{w_i} generates a space which has a generating set of fewer than n elements. Thus {w_i} is
dependent.

(iii) This is the contrapositive of the statement of (ii), and so it follows from (ii).

5.48. Suppose V is the direct sum of its subspaces U and W, i.e. V = U ⊕ W. Show that:
(i) if {u_1, ..., u_m} ⊆ U and {w_1, ..., w_n} ⊆ W are independent, then {u_i, w_j} is also
independent; (ii) dim V = dim U + dim W.

(i) Suppose a_1 u_1 + ... + a_m u_m + b_1 w_1 + ... + b_n w_n = 0, where a_i, b_j are scalars. Then

    0 = (a_1 u_1 + ... + a_m u_m) + (b_1 w_1 + ... + b_n w_n) = 0 + 0

where 0, a_1 u_1 + ... + a_m u_m ∈ U and 0, b_1 w_1 + ... + b_n w_n ∈ W. Since such a sum for 0 is
unique, this leads to

    a_1 u_1 + ... + a_m u_m = 0,    b_1 w_1 + ... + b_n w_n = 0

The independence of the u_i implies that the a_i are all 0, and the independence of the w_j implies
that the b_j are all 0. Consequently, {u_i, w_j} is independent.

(ii) Method 1. Since V = U ⊕ W, we have V = U + W and U ∩ W = {0}. Thus, by Theorem
5.8, page 90,

    dim V = dim U + dim W - dim(U ∩ W) = dim U + dim W - 0 = dim U + dim W

Method 2. Suppose {u_1, ..., u_m} and {w_1, ..., w_n} are bases of U and W respectively. Since
they generate U and W respectively, {u_i, w_j} generates V = U + W. On the other hand, by
(i), {u_i, w_j} is independent. Thus {u_i, w_j} is a basis of V; hence dim V = dim U + dim W.

5.49. Let U be a subspace of a vector space V of finite dimension. Show that there exists
a subspace W of V such that V = U ⊕ W.

Let {u_1, ..., u_r} be a basis of U. Since {u_i} is linearly independent, it can be extended to a
basis of V, say {u_1, ..., u_r, w_1, ..., w_s}. Let W be the space generated by {w_1, ..., w_s}. Since
{u_i, w_j} generates V, V = U + W. On the other hand, U ∩ W = {0} (Problem 5.62). Accordingly,
V = U ⊕ W.

5.50. Recall (page 65) that if K is a subfield of a field E (or: E is an extension of K), then
E may be viewed as a vector space over K. (i) Show that the complex field C is a
vector space of dimension 2 over the real field R. (ii) Show that the real field R is a
vector space of infinite dimension over the rational field Q.

(i) We claim that {1, i} is a basis of C over R. For if v ∈ C, then v = a + bi = a*1 + b*i
where a, b ∈ R; that is, {1, i} generates C over R. Furthermore, if x*1 + y*i = 0 or
x + yi = 0, where x, y ∈ R, then x = 0 and y = 0; that is, {1, i} is linearly independent
over R. Thus {1, i} is a basis of C over R, and so C is of dimension 2 over R.

(ii) We claim that, for any n, {1, π, π^2, ..., π^n} is linearly independent over Q. For suppose
a_0*1 + a_1 π + a_2 π^2 + ... + a_n π^n = 0, where the a_i ∈ Q, and not all the a_i are 0. Then π is a
root of the following nonzero polynomial over Q: a_0 + a_1 x + a_2 x^2 + ... + a_n x^n. But it can be
shown that π is a transcendental number, i.e. that π is not a root of any nonzero polynomial
over Q. Accordingly, the n + 1 real numbers 1, π, π^2, ..., π^n are linearly independent over Q.
Thus for any finite n, R cannot be of dimension n over Q, i.e. R is of infinite dimension over Q.

5.51. Let K be a subfield of a field L and L a subfield of a field E: K ⊆ L ⊆ E. (Hence K is a
subfield of E.) Suppose that E is of dimension n over L and L is of dimension m
over K. Show that E is of dimension mn over K.

Suppose {v_1, ..., v_n} is a basis of E over L and {a_1, ..., a_m} is a basis of L over K. We claim
that {a_i v_j : i = 1, ..., m, j = 1, ..., n} is a basis of E over K. Note that {a_i v_j} contains mn
elements.

Let w be any arbitrary element in E. Since {v_1, ..., v_n} generates E over L, w is a linear com-
bination of the v_j with coefficients in L:

    w = b_1 v_1 + b_2 v_2 + ... + b_n v_n,    b_i ∈ L        (1)

Since {a_1, ..., a_m} generates L over K, each b_i ∈ L is a linear combination of the a_j with co-
efficients in K:

    b_1 = k_11 a_1 + k_12 a_2 + ... + k_1m a_m
    .........................................
    b_n = k_n1 a_1 + k_n2 a_2 + ... + k_nm a_m

where k_ij ∈ K. Substituting in (1), we obtain

    w = (k_11 a_1 + ... + k_1m a_m)v_1 + (k_21 a_1 + ... + k_2m a_m)v_2 + ... + (k_n1 a_1 + ... + k_nm a_m)v_n
      = k_11 a_1 v_1 + ... + k_1m a_m v_1 + k_21 a_1 v_2 + ... + k_2m a_m v_2 + ... + k_n1 a_1 v_n + ... + k_nm a_m v_n

where k_ij ∈ K. Thus w is a linear combination of the a_i v_j with coefficients in K; hence {a_i v_j} gen-
erates E over K.

The proof is complete if we show that {a_i v_j} is linearly independent over K. Suppose, for scalars
x_ji ∈ K, Σ x_ji (a_i v_j) = 0; that is,

    (x_11 a_1 v_1 + x_12 a_2 v_1 + ... + x_1m a_m v_1) + ... + (x_n1 a_1 v_n + x_n2 a_2 v_n + ... + x_nm a_m v_n) = 0

or    (x_11 a_1 + x_12 a_2 + ... + x_1m a_m)v_1 + ... + (x_n1 a_1 + x_n2 a_2 + ... + x_nm a_m)v_n = 0

Since {v_1, ..., v_n} is linearly independent over L and since the above coefficients of the v_j belong to
L, each coefficient must be 0:

    x_11 a_1 + x_12 a_2 + ... + x_1m a_m = 0,  ...,  x_n1 a_1 + x_n2 a_2 + ... + x_nm a_m = 0

But {a_1, ..., a_m} is linearly independent over K; hence since the x_ji ∈ K,

    x_11 = 0, x_12 = 0, ..., x_1m = 0,  ...,  x_n1 = 0, x_n2 = 0, ..., x_nm = 0

Accordingly, {a_i v_j} is linearly independent over K and the theorem is proved.



Supplementary Problems 

LINEAR DEPENDENCE 

5.52. Determine whether u and v are linearly dependent where: 
(i) u = (1, 2, 3, 4), V = (4, 3, 2, 1) (iii) m = (0, 1), v = (0, -3) 
(ii) u = (-1, 6, -12), V = (|, -3, 6) (iv) u = (1, 0, 0), v = (0, 0, -3) 

(vii) u = -fi + |t2 - 16, V = ^f3 - :^t2 + 8 (viii) m = t^ + 3t + 4^ -y = t^ + 4t + 3 

5.53. Determine whether the following vectors in R* are linearly dependent or independent: (i) (1, 3, —1, 4), 

(3,8,-5,7), (2,9,4,23); (ii) (1,-2,4,1), (2, 1,0,-3), (3, -6,1,4). 

5.54. Let V be the vector space of 2 X 3 matrices over R. Determine whether the matrices A,B,C B V 
are linearly dependent or independent where: 

1 -2 3\ „ _ /I -1 4\ /3 -8 7 

4 5 -2 7 ' I 2 10 -1 



<"'^=a4-:). -(4 I'D- ^=(r4-s 



5.55. Let V be the vector space of polynomials of degree ≤ 3 over R. Determine whether u, v, w ∈ V are
linearly dependent or independent where:

(i) u = t^3 - 4t^2 + 2t + 3,   v = t^3 + 2t^2 + 4t - 1,   w = 2t^3 - t^2 - 3t + 5

(ii) u = t^3 - 5t^2 - 2t + 3,   v = t^3 - 4t^2 - 3t + 4,   w = 2t^3 - 7t^2 - 7t + 9




5.56. Let V be the vector space of functions from R into R. Show that f,g,h e V are linearly independ- 
ent where: (i) f{t) = e«, g(t) = sin t, h(t) = th (ii) /(*) = e«, g(t) = e^*, hit) = t; (iii) /(f) = e«, 
g{t) = sin t, h(t) = cos t. 

5.57. Show that: (i) the vectors (1 — i, i) and (2, —1 + i) in C^ are linearly dependent over the complex 
field C but are linearly independent over the real field R; (ii) the vectors (3 -I-V2, 1 + \/2) and 
(7, 1 + 2'\/2 ) in R2 are linearly dependent over the real field R but are linearly independent over the 
rational field Q. 

5.58. Suppose u, v and w are linearly independent vectors. Show that: 
(i) u + V — 2w, u — v — w and u + w are linearly independent; 
(ii) u + V — 3w, u + 3v — w and v + w are linearly dependent. 

5.59. Prove or show a counterexample: If the nonzero vectors u, v and w are linearly dependent, then w 
is a linear combination of u and v. 

5.60. Suppose Vi, v^, . . . , f „ are linearly independent vectors. Prove the following: 

(i) {oi'Wi, a2>'2< • • •> "n'^'n} is linearly independent where each eij # 0. 

(ii) {vi, . . .,v^-l,w,v^^l, . . ., v„} is linearly independent where w = b^v^ + • • • + b^Vi + • • • + b„v„ 
and 5i # 0. 

5.61. Let V = (a, 6) and w — (c, d) belong to K^. Show that {v, w} is linearly dependent if and only if 
ad— be = 0. 

5.62. Suppose {ui, . . . , m^, Wi, . . . , Wg} is a linearly independent subset of a vector space V. Show that 
L(Mi) n L{Wj) = {0}. (Recall that L(Mj) is the linear span, i.e. the space generated by the Mj.) 

5.63. Suppose (a^, ..., ai„), . . . , (ttmi, . . . , a^^) a^'e linearly independent vectors in Z", and suppose 
■Ui, . . . , v„ are linearly independent vectors in a vector space V over K. Show that the vectors 

Wi = au^'i + • • • + ffliB^n, . . ., Wm = a^i^i + • • • + «mn1'K 

are also linearly independent. 

BASIS AND DIMENSION 

5.64. Determine whether or not each of the following forms a basis of R^: 
(i) (1, 1) and (3, 1) (iii) (0, 1) and (0, -3) 

(ii) (2, 1), (1, -1) and (0, 2) (iv) (2, 1) and (-3, 87) 

5.65. Determine whether or not each of the following forms a basis of R^: 
(i) (1, 2, -1) and (0, 3, 1) 

(ii) (2, 4, -3), (0, 1, 1) and (0, 1, -1) 

(iii) (1, 5, -6), (2, 1, 8), (3, -1, 4) and (2, 1, 1) 

(iv) (1, 3, -4), (1, 4, -3) and (2, 3, -11) 

5.66. Find a basis and the dimension of the subspace W of R* generated by: 
(i) (1, 4, -1, 3), (2, 1, -3, -1) and (0, 2, 1, -5) 

(ii) (1, -4, -2, 1), (1, -3, -1, 2) and (3, -8, -2, 7) 

5.67. Let V be the space of 2 X 2 matrices over R and let W be the subspace generated by 

ui). (-: \y (-r:) - (4-;; 

Find a basis and the dimension of W. 

5.68. Let W be the space generated by the polynomials

    u = t^3 + 2t^2 - 2t + 1,   v = t^3 + 3t^2 - t + 4   and   w = 2t^3 + t^2 - 7t - 7

Find a basis and the dimension of W.

Find a basis and the dimension of W. 




5.69. Find a basis and the dimension of the solution space W of each homogeneous system: 

X + Sy + 2z = X - 2y + 7z - 

X + 4:y + 2z = 
X + 5y + z - 2x + 3y - 2z = 

2x+ y + 5z = 
3x + 5y + 8z = 2x - y + z = 

(i) (li) (iii) 

5.70. Find a basis and the dimension of the solution space W of each homogeneous system: 

X + 2y ~2z + 2s - t = x + 2y - z + 3s-4t = 

X + 2y - z + 3s - 2t = 2x + 4y - 2z - s + 5t = 

2x + 4y ~ Iz + s + t = 2x + iy - 2z + 4:S -2t - 

(i) (ii) 

5.71. Find a homogeneous system whose solution set W is generated by 

{(1, -2, 0, 3, -1), (2, -3, 2, 5, -3), (1, -2, 1, 2, -2)} 

5.72. Let V and W be the following subspaces of R^4:

    V = {(a, b, c, d) : b - 2c + d = 0},    W = {(a, b, c, d) : a = d, b = 2c}

Find a basis and the dimension of (i) V, (ii) W, (iii) V ∩ W.

5.73. Let V be the vector space of polynomials in t of degree ≤ n. Determine whether or not each of the
following is a basis of V:

(i) {1, 1 + t, 1 + t + t^2, 1 + t + t^2 + t^3, ..., 1 + t + t^2 + ... + t^{n-1} + t^n}

(ii) {1 + t, t + t^2, t^2 + t^3, ..., t^{n-2} + t^{n-1}, t^{n-1} + t^n}.

SUMS AND INTERSECTIONS 

5.74. Suppose V and W are 2-dimensional subspaces of R^3. Show that V ∩ W ≠ {0}.

5.75. Suppose U and W are subspaces of V and that dim U = 4, dim W = 5 and dim V = 7. Find the
possible dimensions of U ∩ W.

5.76. Let U and W be subspaces of R^3 for which dim U = 1, dim W = 2 and U ⊄ W. Show that
R^3 = U ⊕ W.

5.77. Let U be the subspace of Rs generated by 

{(1, 3, -3, -1, -4), (1, 4, -1, -2, -2), (2, 9, 0, -5, -2)} 
and let W be the subspace generated by 

{(1, 6, 2, -2, 3), (2, 8, -1, -6, -5), (1, 3, -1, -5, -6)} 
Find (i) dim {U+W), (ii) dim (t/n VT). 

5.78. Let V be the vector space of polynomials over R. Let U and W be the subspaces generated by

    {t^3 + 4t^2 - t + 3,  t^3 + 5t^2 + 5,  3t^3 + 10t^2 - 5t + 5}   and   {t^3 + 4t^2 + 6,  t^3 + 2t^2 - t + 5,  2t^3 + 2t^2 - 3t + 9}

respectively. Find (i) dim(U + W), (ii) dim(U ∩ W).

5.79. Let U be the subspace of RS generated by 

{(1, -1, -1, -2, 0), (1, -2, -2, 0, -3), (1, -1, -2, -2, 1)} 
and let W be the subspace generated by 

{(1, -2, -3, 0, -2), (1, -1, -3, 2, -4), (1, -1, -2, 2, -5)} 
(i) Find two homogeneous systems whose solution spaces are U and W, respectively, 
(ii) Find a basis and the dimension of U r\W. 




COORDINATE VECTORS 

5.80. Consider the following basis of B^: {(2, 1), (1, -1)}. Find the coordinate vector of vSU^ relative 
to the above basis where: (i) i; = (2,3); (ii) v = (4,-1), (iii) (3,-3); (iv) v = (a,b). 

5.81. In the vector space V of polynomials in t of degree - 3, consider the following basis: {1, 1 - t, (1 - t)^, 
(1 _ t)3}. Find the coordinate vector of v S y relative to the above basis if: (i) v = 2 - 3t + t* + 2t^; 
(ii) i; = 3 - 2t - ^2; (iii) v = a + bt + ct^ + dt^. 

5 82 In the vector space PF of 2 X 2 symmetric matrices over R, consider the following basis: 

{(-11). a :)•(-' 1)} 

Find the coordinate vector of the matrix AGW relative to the above basis if: 

... - = (4 -I) <"' ^<l I) 

5.83. Consider the following two bases of R*: 

{ei = (1, 1, 1), 02 = (0, 2, 3), 63 = (0, 2, -1)} and {/i = (1, 1, 0), /z = (1, -1. 0), fs = (0, 0, 1)} 
(i) Find the coordinate vector ot v = (3,5,-2) relative to each basis: [v]^ and [v]y. 
(ii) Find the matrix P whose rows are respectively the coordinate vectors of the e, relative to the 

basis {/i, /a, /a)- 
(iii) Verify that [v]eP=[v]f. 

5.84. Suppose {e^, . . .,e„} and {fu ••../„} are bases of a vector space V (of dimension n). Let P be the 
matrix whose rows are respectively the coordinate vectors of the e's relative to the basis {fih Prove 
that for any vector veV, [v]^P = [v]f. (This result is proved in Problem 5.39 in the case n - 3.) 

5.85. Show that the coordinate vector of S F relative to any basis of V is always the zero w-tuple 
(0,0, ...,0). 

RANK OF A MATRIX 

5.86. Find the rank of each matrix: 




5.87. Let A and B be arbitrary mXn matrices. Show that rank (A + B) ^ rank (A) + rank (B). 

5.88. Give examples of 2 X 2 matrices A and B such that: 

(i) rank (A + B)< rank (A), rank {B) (ii) rank (A + B) = rank (A) = rank (B) 

(iii) rank (A + B) > rank (A), rank (B) 

MISCELLANEOUS PROBLEMS 

5.89. Let W be the vector space of 3 × 3 symmetric matrices over K. Show that dim W = 6 by ex-
hibiting a basis of W. (Recall that A = (a_ij) is symmetric iff a_ij = a_ji.)

5.90. Let W be the vector space of 3 × 3 antisymmetric matrices over K. Show that dim W = 3 by
exhibiting a basis of W. (Recall that A = (a_ij) is antisymmetric iff a_ij = -a_ji.)

5.91. Suppose dim V = n. Show that a generating set with n elements is a basis. (Compare with Theorem
5.6(iii), page 89.)




5.92. Let t_1, t_2, ..., t_n be symbols, and let K be any field. Let V be the set of expressions a_1 t_1 + a_2 t_2 +
... + a_n t_n where a_i ∈ K. Define addition in V by

    (a_1 t_1 + a_2 t_2 + ... + a_n t_n) + (b_1 t_1 + b_2 t_2 + ... + b_n t_n)
        = (a_1 + b_1) t_1 + (a_2 + b_2) t_2 + ... + (a_n + b_n) t_n

Define scalar multiplication on V by

    k(a_1 t_1 + a_2 t_2 + ... + a_n t_n) = ka_1 t_1 + ka_2 t_2 + ... + ka_n t_n

Show that V is a vector space over K with the above operations. Also show that {t_1, ..., t_n} is a
basis of V where, for i = 1, ..., n,

    t_i = 0 t_1 + ... + 0 t_{i-1} + 1 t_i + 0 t_{i+1} + ... + 0 t_n

5.93. Let V be a vector space of dimension n over a field K, and let K be a vector space of dimension m
over a subfield F. (Hence V may also be viewed as a vector space over the subfield F.) Prove that
the dimension of V over F is mn.

5.94. Let U and W be vector spaces over the same field K, and let V be the external direct sum of U and
W (see Problem 4.45). Let Û and Ŵ be the subspaces of V defined by Û = {(u, 0) : u ∈ U} and
Ŵ = {(0, w) : w ∈ W}.

(i) Show that U is isomorphic to Û under the correspondence u ↔ (u, 0), and that W is iso-
morphic to Ŵ under the correspondence w ↔ (0, w).

(ii) Show that dim V = dim U + dim W.

5.95. Suppose V = U ⊕ W. Let V̂ be the external direct sum of U and W. Show that V is isomorphic
to V̂ under the correspondence v = u + w ↔ (u, w).



Answers to Supplementary Problems 

5.52. (i) no, (ii) yes, (iii) yes, (iv) no, (v) yes, (vi) no, (vii) yes, (viii) no. 

5.53. (i) dependent, (ii) independent. 

5.54. (i) dependent, (ii) independent. 

5.55. (i) independent, (ii) dependent. 

5.57. (i) (2,-l + t) = (l + i)(l-t,i); (ii) (7, l + 2\/2) = (3-v^)(3 + -\/2, 1 + -/2). 

5.59. The statement is false. Counterexample: u = (1, 0), v — (2, 0) and w = (1, 1) in R*. Lemma 5.2 
requires that one of the nonzero vectors u,v,w is a linear combination of the preceding ones. In 
this case, v = 2m. 

5.64. (i) yes, (ii) no, (iii) no, (iv) yes. 

5.65. (i) no, (ii) yes, (iii) no, (iv) no. 

5.66. (i) dim W = 3, (ii) dim W = 2. 

5.67. dim W = 2. 

5.68. dim Ty = 2. 

5.69. (i) basis, {(7, -1, -2)}; dim W = 1.  (ii) dim W = 0.  (iii) basis, {(18, -1, -7)}; dim W = 1.

5.70. (i) basis, {(2, -1, 0, 0, 0), (4, 0, 1, -1, 0), (3, 0, 1, 0, 1)}; dim W = 3.
      (ii) basis, {(2, -1, 0, 0, 0), (1, 0, 1, 0, 0)}; dim W = 2.






5.71.  5x + y - z - s = 0,   x + y - z - t = 0

5.72. (i) basis, {(1, 0, 0, 0), (0, 2, 1, 0), (0, -1, 0, 1)}; dim V = 3.
      (ii) basis, {(1, 0, 0, 1), (0, 2, 1, 0)}; dim W = 2.
      (iii) basis, {(0, 2, 1, 0)}; dim(V ∩ W) = 1. Hint. V ∩ W must satisfy all three conditions on a, b, c
      and d.

5.73. (i) yes, (ii) no. For dim V = n + 1, but the set contains only n elements.

5.75. dim(U ∩ W) = 2, 3 or 4.

5.77. dim(U + W) = 3, dim(U ∩ W) = 2.

5.78. dim(U + W) = 3, dim(U ∩ W) = 1.

5.79. (i) The two homogeneous systems are illegible in this reproduction.
      (ii) {(1, -2, -5, 0, 0), (0, 0, 1, 0, -1)}; dim(U ∩ W) = 2.

5.80. (i) [v] = (5/3, -4/3),  (ii) [v] = (1, 2),  (iii) [v] = (0, 3),  (iv) [v] = ((a + b)/3, (a - 2b)/3).

5.81. (i) [v] = (2, -5, 7, -2),  (ii) [v] = (0, 4, -1, 0),  (iii) [v] = (a + b + c + d, -b - 2c - 3d, c + 3d, -d).

5.82. (i) [A] = (2, -1, 1),  (ii) [A] = (3, 1, -2).

5.83. (i) [v]_e = (3, -1, 2), [v]_f = (4, -1, -2);  (ii) P = ( 1  0  1 )
                                                             ( 1 -1  3 )
                                                             ( 1 -1 -1 )



5.86. (i) 3, (ii) 2, (iii) 3, (iv) 2. 



5.88. (i) A 
(ii) A 



(J P' 

(o o)' 



B 



CI -I) 

c :) 



5.89.  ( 1 0 0 )  ( 0 1 0 )  ( 0 0 1 )  ( 0 0 0 )  ( 0 0 0 )  ( 0 0 0 )
       ( 0 0 0 ), ( 1 0 0 ), ( 0 0 0 ), ( 0 1 0 ), ( 0 0 1 ), ( 0 0 0 )
       ( 0 0 0 )  ( 0 0 0 )  ( 1 0 0 )  ( 0 0 0 )  ( 0 1 0 )  ( 0 0 1 )

5.90.  (  0 1 0 )  (  0 0 1 )  ( 0  0 0 )
       ( -1 0 0 ), (  0 0 0 ), ( 0  0 1 )
       (  0 0 0 )  ( -1 0 0 )  ( 0 -1 0 )



5.93. Hint. The proof is identical to that given in Problem 5.48, page 113, for a special case (when V is 
an extension field of K). 



chapter 6 



Linear Mappings 



MAPPINGS 

Let A and B be arbitrary sets. Suppose to each a ∈ A there is assigned a unique ele-
ment of B; the collection, f, of such assignments is called a function or mapping (or map)
from A into B, and is written

    f : A → B

We write f(a), read "f of a", for the element of B that f assigns to a ∈ A; it is called the
value of f at a or the image of a under f. If A' is any subset of A, then f(A') denotes the set
of images of elements of A'; and if B' is any subset of B, then f^{-1}(B') denotes the set of
elements of A each of whose image lies in B':

    f(A') = {f(a) : a ∈ A'}    and    f^{-1}(B') = {a ∈ A : f(a) ∈ B'}

We call f(A') the image of A' and f^{-1}(B') the inverse image or preimage of B'. In particular,
the set of all images, i.e. f(A), is called the image (or: range) of f. Furthermore, A is called
the domain of the mapping f : A → B, and B is called its co-domain.

To each mapping f : A → B there corresponds the subset of A × B given by
{(a, f(a)) : a ∈ A}. We call this set the graph of f. Two mappings f : A → B and g : A → B
are defined to be equal, written f = g, if f(a) = g(a) for every a ∈ A, that is, if they have
the same graph. Thus we do not distinguish between a function and its graph. The nega-
tion of f = g is written f ≠ g and is the statement: there exists an a ∈ A for which
f(a) ≠ g(a).



Example 6.1:  Let A = {a, b, c, d} and B = {x, y, z, w}. The following diagram defines a mapping
f from A into B:

    [diagram: a ↦ y, b ↦ x, c ↦ z, d ↦ y]

Here f(a) = y, f(b) = x, f(c) = z, and f(d) = y. Also,

    f({a, b, d}) = {f(a), f(b), f(d)} = {y, x, y} = {x, y}

The image (or: range) of f is the set {x, y, z}: f(A) = {x, y, z}.

Example 6.2:  Let f : R → R be the mapping which assigns to each real number x its square x^2:

    x ↦ x^2    or    f(x) = x^2

Here the image of -3 is 9 so we may write f(-3) = 9.









We use the arrow ↦ to denote the image of an arbitrary element x ∈ A under a mapping
f : A → B by writing

    x ↦ f(x)

as illustrated in the preceding example.



Example 6.3:  Consider the 2 × 3 matrix

    A = ( 1 -3  5 )
        ( 2  4 -1 )

If we write the vectors in R^3 and R^2 as column vectors, then A determines the mapping T : R^3 → R^2
defined by

    v ↦ Av,   that is,   T(v) = Av,  v ∈ R^3

Thus if

    v = (  3 )        then    T(v) = Av = ( 1 -3  5 ) (  3 )   =  ( -10 )
        (  1 ),                           ( 2  4 -1 ) (  1 )      (  12 )
        ( -2 )                                        ( -2 )

Remark:  Every m × n matrix A over a field K determines the mapping T : K^n → K^m
defined by

    v ↦ Av

where the vectors in K^n and K^m are written as column vectors. For convenience
we shall usually denote the above mapping by A, the same symbol used for the
matrix.
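
The mapping of Example 6.3 is easy to evaluate numerically. The sketch below is an addition to the text and assumes the sympy library; it reproduces T(v) = Av for the vector used above.

    from sympy import Matrix

    A = Matrix([[1, -3, 5],
                [2, 4, -1]])   # the 2 x 3 matrix of Example 6.3
    v = Matrix([3, 1, -2])     # a vector of R^3 written as a column

    print(A * v)               # Matrix([[-10], [12]]) = T(v)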



Example 6.4:  Let V be the vector space of polynomials in the variable t over the real field R.
Then the derivative defines a mapping D : V → V where, for any polynomial f ∈ V,
we let D(f) = df/dt. For example, D(3t^2 - 5t + 2) = 6t - 5.

Example 6.5:  Let V be the vector space of polynomials in t over R (as in the preceding example).
Then the integral from, say, 0 to 1 defines a mapping ∫ : V → R where, for any
polynomial f ∈ V, we let ∫(f) = ∫₀¹ f(t) dt. For example,

    ∫(3t^2 - 5t + 2) = ∫₀¹ (3t^2 - 5t + 2) dt = 1/2

Note that this map is from the vector space V into the scalar field R, whereas the
map in the preceding example is from V into itself.

Example 6.6:  Consider two mappings f : A → B and g : B → C illustrated below:

    [diagram: A → B → C]

Let a ∈ A; then f(a) ∈ B, the domain of g. Hence we can obtain the image of f(a)
under the mapping g, that is, g(f(a)). This map

    a ↦ g(f(a))

from A into C is called the composition or product of f and g, and is denoted by
g∘f. In other words, (g∘f) : A → C is the mapping defined by

    (g∘f)(a) = g(f(a))

Our first theorem tells us that composition of mappings satisfies the associative law.

Theorem 6.1:  Let f : A → B, g : B → C and h : C → D. Then h∘(g∘f) = (h∘g)∘f.

We prove this theorem now. If a ∈ A, then

    (h∘(g∘f))(a) = h((g∘f)(a)) = h(g(f(a)))

and    ((h∘g)∘f)(a) = (h∘g)(f(a)) = h(g(f(a)))

Thus (h∘(g∘f))(a) = ((h∘g)∘f)(a) for every a ∈ A, and so h∘(g∘f) = (h∘g)∘f.

Remark:  Let F : A → B. Some texts write aF instead of F(a) for the image of a ∈ A
under F. With this notation, the composition of functions F : A → B and
G : B → C is denoted by F∘G and not by G∘F as used in this text.






We next introduce some special types of mappings.

Definition:  A mapping f : A → B is said to be one-to-one (or one-one or 1-1) or injective
if different elements of A have distinct images; that is,

    if a ≠ a' implies f(a) ≠ f(a')

or, equivalently,

    if f(a) = f(a') implies a = a'

Definition:  A mapping f : A → B is said to be onto (or: f maps A onto B) or surjective if
every b ∈ B is the image of at least one a ∈ A.

A mapping which is both one-one and onto is said to be bijective.

Example 6.7:  Let f : R → R, g : R → R and h : R → R be defined by f(x) = 2^x, g(x) = x^3 - x and
h(x) = x^2. The graphs of these mappings follow:

    [graphs of f(x) = 2^x, g(x) = x^3 - x and h(x) = x^2]

The mapping f is one-one; geometrically, this means that each horizontal line does
not contain more than one point of f. The mapping g is onto; geometrically, this
means that each horizontal line contains at least one point of g. The mapping h
is neither one-one nor onto; for example, 2 and -2 have the same image 4, and -16
is not the image of any element of R.

Example 6.8:  Let A be any set. The mapping f : A → A defined by f(a) = a, i.e. which assigns
to each element in A itself, is called the identity mapping on A and is denoted by
1_A or 1 or I.

Example 6.9:  Let f : A → B. We call g : B → A the inverse of f, written f^{-1}, if

    f∘g = 1_B    and    g∘f = 1_A

We emphasize that f has an inverse if and only if f is both one-to-one and onto
(Problem 6.9). Also, if b ∈ B then f^{-1}(b) = a where a is the unique element of A
for which f(a) = b.



LINEAR MAPPINGS

Let V and U be vector spaces over the same field K. A mapping F : V → U is called a
linear mapping (or linear transformation or vector space homomorphism) if it satisfies the
following two conditions:

(1) For any v, w ∈ V, F(v + w) = F(v) + F(w).

(2) For any k ∈ K and any v ∈ V, F(kv) = kF(v).

In other words, F : V → U is linear if it "preserves" the two basic operations of a vector
space, that of vector addition and that of scalar multiplication.

Substituting k = 0 into (2) we obtain F(0) = 0. That is, every linear mapping takes
the zero vector into the zero vector.






Now for any scalars a, b ∈ K and any vectors v, w ∈ V we obtain, by applying both
conditions of linearity,

    F(av + bw) = F(av) + F(bw) = aF(v) + bF(w)

More generally, for any scalars a_i ∈ K and any vectors v_i ∈ V we obtain the basic
property of linear mappings:

    F(a_1 v_1 + a_2 v_2 + ... + a_n v_n) = a_1 F(v_1) + a_2 F(v_2) + ... + a_n F(v_n)

We remark that the condition F(av + bw) = aF(v) + bF(w) completely characterizes
linear mappings and is sometimes used as its definition.

Example 6.10:  Let A be any m × n matrix over a field K. As noted previously, A determines a
mapping T : K^n → K^m by the assignment v ↦ Av. (Here the vectors in K^n and K^m
are written as columns.) We claim that T is linear. For, by properties of matrices,

    T(v + w) = A(v + w) = Av + Aw = T(v) + T(w)
    T(kv) = A(kv) = kAv = kT(v)

where v, w ∈ K^n and k ∈ K.

We comment that the above type of linear mapping shall occur again and again. In
fact, in the next chapter we show that every linear mapping from one finite-dimensional
vector space into another can be represented as a linear mapping of the above type.

Example 6.11:  Let F : R^3 → R^3 be the "projection" mapping into the xy plane: F(x, y, z) = (x, y, 0).
We show that F is linear. Let v = (a, b, c) and w = (a', b', c'). Then

    F(v + w) = F(a + a', b + b', c + c') = (a + a', b + b', 0)
             = (a, b, 0) + (a', b', 0) = F(v) + F(w)

and, for any k ∈ R,

    F(kv) = F(ka, kb, kc) = (ka, kb, 0) = k(a, b, 0) = kF(v)

That is, F is linear.

Example 6.12:  Let F : R^2 → R^2 be the "translation" mapping defined by F(x, y) = (x + 1, y + 2).
Observe that F(0) = F(0, 0) = (1, 2) ≠ 0. That is, the zero vector is not mapped
onto the zero vector. Hence F is not linear.

Example 6.13:  Let F : V → U be the mapping which assigns 0 ∈ U to every v ∈ V. Then, for
any v, w ∈ V and any k ∈ K, we have

    F(v + w) = 0 = 0 + 0 = F(v) + F(w)    and    F(kv) = 0 = k0 = kF(v)

Thus F is linear. We call F the zero mapping and shall usually denote it by 0.

Example 6.14:  Consider the identity mapping I : V → V which maps each v ∈ V into itself. Then,
for any v, w ∈ V and any a, b ∈ K, we have

    I(av + bw) = av + bw = aI(v) + bI(w)

Thus I is linear.

Example 6.15: Let V be the vector space of polynomials in the variable t over the real field R.
Then the differential mapping D : V -> V and the integral mapping J : V -> R
defined in Examples 6.4 and 6.5 are linear. For it is proven in calculus that for any
u, v ∈ V and k ∈ R,

d(u + v)/dt = du/dt + dv/dt   and   d(ku)/dt = k du/dt

that is, D(u + v) = D(u) + D(v) and D(ku) = k D(u); and also,

∫ (u(t) + v(t)) dt = ∫ u(t) dt + ∫ v(t) dt   and   ∫ k u(t) dt = k ∫ u(t) dt

that is, J(u + v) = J(u) + J(v) and J(ku) = k J(u).
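As an illustrative sketch (not in the original text), one may restrict D to polynomials of degree ≤ 3, represent such a polynomial by its coefficient vector (a0, a1, a2, a3), and check the two linearity conditions numerically; the polynomials and scalar below are arbitrary choices.

```python
import numpy as np

# Represent a0 + a1*t + a2*t^2 + a3*t^3 by its coefficient vector (a0, a1, a2, a3).
# On this 4-dimensional subspace the differential map D sends
# (a0, a1, a2, a3) to (a1, 2*a2, 3*a3, 0).
def D(p):
    a0, a1, a2, a3 = p
    return np.array([a1, 2*a2, 3*a3, 0.])

u = np.array([1., 2., 0., 5.])   # 1 + 2t + 5t^3
v = np.array([4., 0., -3., 1.])  # 4 - 3t^2 + t^3
k = 2.5

print(np.allclose(D(u + v), D(u) + D(v)))  # D(u + v) = D(u) + D(v)
print(np.allclose(D(k * u), k * D(u)))     # D(ku) = k D(u)
```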



CHAP. 6] LINEAR MAPPINGS 125 

Example 6.16: Let F : V -> U be a linear mapping which is both one-one and onto. Then an inverse
mapping F⁻¹ : U -> V exists. We will show (Problem 6.17) that this inverse map-
ping is also linear.

When we investigated the coordinates of a vector relative to a basis, we also introduced
the notion of two spaces being isomorphic. We now give a formal definition.

Definition: A linear mapping F : V -> U is called an isomorphism if it is one-to-one. The
vector spaces V, U are said to be isomorphic if there is an isomorphism of
V onto U.

Example 6.17: Let V be a vector space over K of dimension n and let {e1, ..., en} be a basis of V.
Then as noted previously the mapping v |-> [v]_e, i.e. which maps each v ∈ V into
its coordinate vector relative to the basis {ei}, is an isomorphism of V onto K^n.

Our next theorem gives us an abundance of examples of linear mappings; in particular, 
it tells us that a linear mapping is completely determined by its values on the elements 
of a basis. 

Theorem 6.2: Let V and U be vector spaces over a field K. Let {v1, v2, ..., vn} be a basis
of V and let u1, u2, ..., un be any vectors in U. Then there exists a unique
linear mapping F : V -> U such that F(v1) = u1, F(v2) = u2, ..., F(vn) = un.

We emphasize that the vectors u1, ..., un in the preceding theorem are completely ar-
bitrary; they may be linearly dependent or they may even be equal to each other.



KERNEL AND IMAGE OF A LINEAR MAPPING 

We begin by defining two concepts. 

Definition: Let F : V -> U be a linear mapping. The image of F, written Im F, is the set
of image points in U:

Im F = {u ∈ U : F(v) = u for some v ∈ V}

The kernel of F, written Ker F, is the set of elements in V which map into
0 ∈ U:

Ker F = {v ∈ V : F(v) = 0}

The following theorem is easily proven (Problem 6.22). 

Theorem 6.3: Let F : V -> U be a linear mapping. Then the image of F is a subspace
of U and the kernel of F is a subspace of V.

Example 6.18: Let F : R^3 -> R^3 be the projection map-
ping into the xy plane: F(x, y, z) =
(x, y, 0). Clearly the image of F is the
entire xy plane:

Im F = {(a, b, 0) : a, b ∈ R}

Note that the kernel of F is the z axis:

Ker F = {(0, 0, c) : c ∈ R}

since these points and only these points
map into the zero vector 0 = (0, 0, 0).
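A small numerical sketch of this example (an added illustration, using the matrix of the projection in the usual basis): the rank of the matrix gives dim(Im F), and dim(Ker F) then follows from Theorem 6.4 (proved later in this chapter).

```python
import numpy as np

# The projection F(x, y, z) = (x, y, 0) as a matrix.
F = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 0.]])

rank = np.linalg.matrix_rank(F)
print(rank, 3 - rank)                      # 2 1 : dim(Im F) = 2, dim(Ker F) = 1

# The z axis lies in the kernel, and points of the xy plane are images:
print(F @ np.array([0., 0., 5.]))          # [0. 0. 0.]
print(F @ np.array([2., -3., 7.]))         # [ 2. -3.  0.]
```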




128 



LINEAR MAPPINGS [CHAP. 6 



Theorem 5.11: The dimension of the solution space W of the homogeneous system of linear
equations AX = 0 is n - r where n is the number of unknowns and r is the
rank of the coefficient matrix A.



OPERATIONS WITH LINEAR MAPPINGS 

We are able to combine linear mappings in various ways to obtain new linear mappings. 
These operations are very important and shall be used throughout the text. 

Suppose F : V -> U and G : V -> U are linear mappings of vector spaces over a field K.
We define the sum F + G to be the mapping from V into U which assigns F(v) + G(v) to
v ∈ V:

(F + G)(v) = F(v) + G(v)

Furthermore, for any scalar k ∈ K, we define the product kF to be the mapping from V
into U which assigns kF(v) to v ∈ V:

(kF)(v) = kF(v)

We show that if F and G are linear, then F + G and kF are also linear. We have, for any
vectors v, w ∈ V and any scalars a, b ∈ K,

(F + G)(av + bw) = F(av + bw) + G(av + bw)

= aF(v) + bF(w) + aG(v) + bG(w)
= a(F(v) + G(v)) + b(F(w) + G(w))
= a(F + G)(v) + b(F + G)(w)

and (kF)(av + bw) = kF(av + bw) = k(aF(v) + bF(w))

= akF(v) + bkF(w) = a(kF)(v) + b(kF)(w)

Thus F + G and kF are linear.

The following theorem applies. 

Theorem 6.6: Let V and U be vector spaces over a field K. Then the collection of all
linear mappings from V into U with the above operations of addition and
scalar multiplication form a vector space over K.

The space in the above theorem is usually denoted by

Hom(V, U)

Here Hom comes from the word homomorphism. In the case that V and U are of finite
dimension, we have the following theorem.

Theorem 6.7: Suppose dim V = m and dim U = n. Then dim Hom(V, U) = mn.

Now suppose that V, U and W are vector spaces over the same field K, and that F : V -> U
and G : U -> W are linear mappings.

Recall that the composition function G∘F is the mapping from V into W defined by
(G∘F)(v) = G(F(v)). We show that G∘F is linear whenever F and G are linear. We have,
for any vectors v, w ∈ V and any scalars a, b ∈ K,

(G∘F)(av + bw) = G(F(av + bw)) = G(aF(v) + bF(w))

= aG(F(v)) + bG(F(w)) = a(G∘F)(v) + b(G∘F)(w)
That is, G∘F is linear.



CHAP. 6] LINEAR MAPPINGS 



129 



The composition of linear mappings and that of addition and scalar multiplication are 
related as follows: 

Theorem 6.8: Let V, U and W be vector spaces over K. Let F, F' be linear mappings from
V into U and G, G' linear mappings from U into W, and let k ∈ K. Then:

(i) G∘(F + F') = G∘F + G∘F'

(ii) (G + G')∘F = G∘F + G'∘F

(iii) k(G∘F) = (kG)∘F = G∘(kF).

ALGEBRA OF LINEAR OPERATORS

Let V be a vector space over a field K. We now consider the special case of linear map-
pings T : V -> V, i.e. from V into itself. They are also called linear operators or linear
transformations on V. We will write A(V), instead of Hom(V, V), for the space of all such
mappings.

By Theorem 6.6, A(V) is a vector space over K; it is of dimension n^2 if V is of dimension
n. Now if T, S ∈ A(V), then the composition S∘T exists and is also a linear mapping
from V into itself, i.e. S∘T ∈ A(V). Thus we have a "multiplication" defined in A(V).
(We shall write ST for S∘T in the space A(V).)

We remark that an algebra A over a field K is a vector space over K in which an opera-
tion of multiplication is defined satisfying, for every F, G, H ∈ A and every k ∈ K,

(i) F(G + H) = FG + FH

(ii) (G + H)F = GF + HF

(iii) k(GF) = (kG)F = G(kF).

If the associative law also holds for the multiplication, i.e. if for every F, G, H ∈ A,
(iv) (FG)H = F(GH)

then the algebra A is said to be associative. Thus by Theorems 6.8 and 6.1, A(V) is an
associative algebra over K with respect to composition of mappings; hence it is frequently
called the algebra of linear operators on V.

Observe that the identity mapping I : V -> V belongs to A(V). Also, for any T ∈ A(V),
we have TI = IT = T. We note that we can also form "powers" of T; we use the notation
T^2 = T∘T, T^3 = T∘T∘T, .... Furthermore, for any polynomial

p(x) = a0 + a1x + a2x^2 + ... + anx^n,   ai ∈ K

we can form the operator p(T) defined by

p(T) = a0I + a1T + a2T^2 + ... + anT^n

(For a scalar k ∈ K, the operator kI is frequently denoted by simply k.) In particular, if
p(T) = 0, the zero mapping, then T is said to be a zero of the polynomial p(x).

Example 6.21: Let T : R^3 -> R^3 be defined by T(x, y, z) = (0, x, y). Now if (a, b, c) is any element
of R^3, then:

(T + I)(a, b, c) = (0, a, b) + (a, b, c) = (a, a + b, b + c)

and T^3(a, b, c) = T^2(0, a, b) = T(0, 0, a) = (0, 0, 0)

Thus we see that T^3 = 0, the zero mapping from V into itself. In other words,
T is a zero of the polynomial p(x) = x^3.
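For readers who want to check this numerically, here is a brief Python sketch (not part of the original text) using the matrix of T in the usual basis; it reproduces (T + I)(a, b, c) for one sample vector and shows that T^3 is the zero matrix.

```python
import numpy as np

# T(x, y, z) = (0, x, y) as a matrix acting on column vectors.
T = np.array([[0., 0., 0.],
              [1., 0., 0.],
              [0., 1., 0.]])
I = np.eye(3)

v = np.array([3., -1., 4.])                 # an arbitrary (a, b, c)
print((T + I) @ v)                          # [3. 2. 3.] = (a, a+b, b+c)
print(np.linalg.matrix_power(T, 3))         # the zero matrix, so T^3 = 0
```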



130 LINEAR MAPPINGS [CHAP. 6 

INVERTIBLE OPERATORS 

A linear operator T : V -> V is said to be invertible if it has an inverse, i.e. if there
exists T⁻¹ ∈ A(V) such that TT⁻¹ = T⁻¹T = I.

Now T is invertible if and only if it is one-one and onto. Thus in particular, if T is
invertible then only 0 ∈ V can map into 0, i.e. T is nonsingular. On the other hand,
suppose T is nonsingular, i.e. Ker T = {0}. Recall (page 127) that T is then also one-one. More-
over, assuming V has finite dimension, we have, by Theorem 6.4,

dim V = dim(Im T) + dim(Ker T) = dim(Im T) + dim({0})

= dim(Im T) + 0 = dim(Im T)

Then Im T = V, i.e. the image of T is V; thus T is onto. Hence T is both one-one and onto
and so is invertible. We have just proven

Theorem 6.9: A linear operator T:V-*V on a vector space of finite dimension is in- 
vertible if and only if it is nonsingular. 

Example 6.22: Let T be the operator on R^2 defined by T(x, y) = (y, 2x - y). The kernel of T is
{(0, 0)}; hence T is nonsingular and, by the preceding theorem, invertible. We now
find a formula for T⁻¹. Suppose (s, t) is the image of (x, y) under T; hence (x, y)
is the image of (s, t) under T⁻¹: T(x, y) = (s, t) and T⁻¹(s, t) = (x, y). We have

T(x, y) = (y, 2x - y) = (s, t) and so y = s, 2x - y = t

Solving for x and y in terms of s and t, we obtain x = s/2 + t/2, y = s. Thus T⁻¹
is given by the formula T⁻¹(s, t) = (s/2 + t/2, s).
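The formula for T⁻¹ can also be checked with a short numerical sketch (an added illustration): invert the matrix of T in the usual basis and read off T⁻¹(s, t).

```python
import numpy as np

# T(x, y) = (y, 2x - y) in the usual basis of R^2.
T = np.array([[0., 1.],
              [2., -1.]])

T_inv = np.linalg.inv(T)
print(T_inv)                 # [[0.5 0.5]
                             #  [1.  0. ]]  i.e. T^-1(s, t) = (s/2 + t/2, s)

s, t = 4., 6.
print(T_inv @ np.array([s, t]))        # [5. 4.]
print(T @ (T_inv @ np.array([s, t])))  # back to [4. 6.]
```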

The finiteness of the dimensionality of V in the preceding theorem is necessary as seen 
in the next example. 

Example 6.23: Let V be the vector space of polynomials over K, and let T be the operator on V

defined by

T(a0 + a1t + ... + ant^n) = a0t + a1t^2 + ... + ant^(n+1)

i.e. T increases the exponent of t in each term by 1. Now T is a linear mapping
and is nonsingular. However, T is not onto and so is not invertible.

We now give an important application of the above theorem to systems of linear
equations over K. Consider a system with the same number of equations as unknowns,
say n. We can represent this system by the matrix equation

Ax = b   (*)

where A is an n-square matrix over K which we view as a linear operator on K^n. Suppose
the matrix A is nonsingular, i.e. the matrix equation Ax = 0 has only the zero solution.
Then, by Theorem 6.9, the linear mapping A is one-to-one and onto. This means that the
system (*) has a unique solution for any b ∈ K^n. On the other hand, suppose the matrix
A is singular, i.e. the matrix equation Ax = 0 has a nonzero solution. Then the linear
mapping A is not onto. This means that there exist b ∈ K^n for which (*) does not have a
solution. Furthermore, if a solution exists it is not unique. Thus we have proven the
following fundamental result:

Theorem 6.10: Consider the following system of linear equations:

a11x1 + a12x2 + ... + a1nxn = b1

a21x1 + a22x2 + ... + a2nxn = b2

...................................

an1x1 + an2x2 + ... + annxn = bn



CHAP. 6] 



LINEAR MAPPINGS 



131 



(i) If the corresponding homogeneous system has only the zero solution,
then the above system has a unique solution for any values of the bi.

(ii) If the corresponding homogeneous system has a nonzero solution, then:
(a) there are values for the bi for which the above system does not have
a solution; (b) whenever a solution of the above system exists, it is
not unique.
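The dichotomy in Theorem 6.10 can be illustrated numerically; the matrices and right-hand sides below are arbitrary choices used only for this added sketch.

```python
import numpy as np

# Nonsingular case: the homogeneous system Ax = 0 has only the zero
# solution, so Ax = b is uniquely solvable for every b.
A = np.array([[2., 1.],
              [1., 3.]])
b = np.array([5., 10.])
print(np.linalg.matrix_rank(A))       # 2  (full rank: A is nonsingular)
print(np.linalg.solve(A, b))          # the unique solution [1. 3.]

# Singular case: Bx = 0 has nonzero solutions, and Bx = c may be unsolvable.
B = np.array([[1., 2.],
              [2., 4.]])
print(np.linalg.matrix_rank(B))                               # 1
print(np.linalg.matrix_rank(np.column_stack([B, [1., 0.]])))  # 2: no solution for c = (1, 0)
```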



Solved Problems 



MAPPINGS 

6.1. State whether or not each diagram defines a mapping from A = {a, b, c) into 
B= {x,y,z}. 





(i) (ii) 

(i) No. There is nothing assigned to the element 6 G A. 
(ii) No. Two elements, x and z, are assigned to c G A. 
(iii) Yes. 




v(iii) 



6.2. Use a formula to define each of the following functions from R into R. 

(i) To each number let / assign its cube. 

(ii) To each number let g assign the number 5. 

(iii) To each positive number let h assign its square, and to each nonpositive number 
let h assign the number 6. 

Also, find the value of each function at 4, —2 and 0. 

(i) Since f assigns to any number x its cube x^3, we can define f by f(x) = x^3. Also:
f(4) = 4^3 = 64, f(-2) = (-2)^3 = -8, f(0) = 0^3 = 0

(ii) Since g assigns 5 to any number x, we can define g by g(x) = 5. Thus the value of g at each
of the numbers 4, -2 and 0 is 5:

g(4) = 5, g(-2) = 5, g(0) = 5

(iii) Two different rules are used to define h as follows:

h(x) = x^2 if x > 0,   h(x) = 6 if x ≤ 0

Since 4 > 0, h(4) = 4^2 = 16. On the other hand, -2 ≤ 0 and 0 ≤ 0, and so h(-2) = 6, h(0) = 6.



132 



LINEAR MAPPINGS 



[CHAP. 6 



6^. Let A = {1,2,3,4,5} and let f:A-^A be the map- 
ping defined by the diagram on the right, (i) Find 
the image of /. (ii) Find the graph of /. 

(i) The image /(A) of the mapping / consists of all the points 
assigned to elements of A. Now only 2, 3 and 5 appear as 
the image of any elements of A; hence /(A) = {2,3,5}. 

(ii) The graph of / consists of the ordered pairs (a, /(a)), 
where a&A. Now /(I) = 3, /(2) = 5, /(3) = 5, /(4) = 2, 
/(5) = 3; hence the graph of 

/ = {(1,3),(2,5), (3,5), (4,2),(5,3)} 




6.4. Sketch the graph of (i) f(x) = x^2 + x - 6, (ii) g(x) = x^3 - 3x^2 - x + 3.

Note that these are "polynomial functions". In each case set up a table of values for x and
then find the corresponding values of f(x). Plot the points in a coordinate diagram and then draw
a smooth continuous curve through the points.

(i)   x    | -4 | -3 | -2 | -1 |  0 |  1 | 2 | 3
      f(x) |  6 |  0 | -4 | -6 | -6 | -4 | 0 | 6

(ii)  x    | -2  | -1 | 0 | 1 |  2 | 3 |  4
      g(x) | -15 |  0 | 3 | 0 | -3 | 0 | 15




6.5. Let the mappings f.A^B and g.B^C be defined by the diagram 
A f B g C 




(i) Find the composition mapping {gof):A-*C. (ii) Find the image of each map- 
ping: f,g and go f. 

(i) We use the definition of the composition mapping to compute: 

(gofXa) = g(f(a)) = g(y) = t 
ig°f)ib) = gim) = g(x) = s 
(9° me) = g(f(c)) = g{y) = t 
Observe that we arrive at the same answer if we "follow the arrows" in the diagram: 

a -* y -* t, b -* X -* s, e ^ y -* t 



CHAP. 6] LINEAR MAPPINGS I33 

(ii) By the diagram, the image values under the mapping / are x and y, and the image values under 
g are r, s and t\ hence 

image of / = {x, y} and image oi g = {r, s, t} 
By (i), the image values under the composition mapping gof are t and s; hence image of 
0°f = {s, *}• Note that the images of g and g°f are different. 

6.6. Let the mappings f and g be defined by f(x) = 2x + 1 and g(x) = x^2 - 2. (i) Find
(g∘f)(4) and (f∘g)(4). (ii) Find formulas defining the composition mappings g∘f
and f∘g.

(i) f(4) = 2·4 + 1 = 9. Hence (g∘f)(4) = g(f(4)) = g(9) = 9^2 - 2 = 79.

g(4) = 4^2 - 2 = 14. Hence (f∘g)(4) = f(g(4)) = f(14) = 2·14 + 1 = 29.

(ii) Compute the formula for g∘f as follows:

(g∘f)(x) = g(f(x)) = g(2x + 1) = (2x + 1)^2 - 2 = 4x^2 + 4x - 1

Observe that the same answer can be found by writing y = f(x) = 2x + 1 and z = g(y) =
y^2 - 2, and then eliminating y: z = y^2 - 2 = (2x + 1)^2 - 2 = 4x^2 + 4x - 1.

(f∘g)(x) = f(g(x)) = f(x^2 - 2) = 2(x^2 - 2) + 1 = 2x^2 - 3. Observe that f∘g ≠ g∘f.
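A two-line Python sketch (added here, not part of the original solution) confirms these values and checks the formula for g∘f at a sample point.

```python
# The mappings of Problem 6.6: f(x) = 2x + 1 and g(x) = x^2 - 2.
f = lambda x: 2*x + 1
g = lambda x: x**2 - 2

print(g(f(4)), f(g(4)))           # 79 29, as computed above
print(g(f(3)), 4*3**2 + 4*3 - 1)  # 47 47: (g o f)(x) = 4x^2 + 4x - 1 checks at x = 3
```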

6.7. Let the mappings f:A-^B,g:B-^C and h:C-*D be defined by the diagram 

A f B g c h D 



a 




Determine if each mapping (i) is one-one, (ii) is onto, (iii) has an inverse. 

(i) The mapping / : A -> B is one-one since each element of A has a different image. The mapping 
g.B->C is not one-one since x and z both map into the same element 4. The mapping h:C -* D 
is one-one. 

(ii) The mapping f -.A^B is not onto since « e B is not the image of any element of A. The 
mapping g:B-*C is onto since each element of C is the image of some element of B. The 
mapping h-.C^D is also onto. 

(iii) A mapping has an inverse if and only if it is both one-one and onto. Hence only h has an 
inverse. 

6.8. Suppose f:A-*B and g.B^C; hence the composition mapping {gof):A^C exists. 
Prove the following, (i) If / and g are one-one, then gof is one-one. (ii) If / and g 
are onto, then gof is, onto, (iii) If gof is, one-one, then / is one-one. (iv) If gof is 
onto, then g is onto. 

(i) Suppose (g°f)(x) = (g°f){y). Then g(f{x)) = g{f(y)). Since g is one-one, f{x) = f(y). Since / 
is one-one, x = y. We have proven that {g ° f){x) = (g ° f){y) implies x = y\ hence flro/ is 
one-one. 

(ii) Suppose c&C. Since g is onto, there exists h e. B for which g(h) = c. Since / is onto, there 
exists ae.A for which f(a) - b. Thus (g '' f){a) = g(f(a)) = g(b) = c; hence flr o / is onto. 

(iii) Suppose / is not one-one. Then there exists distinct elements x,y G A for which /(a;) = f{y). 
Thus {g°f)(x) = g{f{x)) = g(f(y)) = (g ° f)(y); hence g°f is not one-one. Accordingly it g°f 
is one-one, then / must be one-one. 

(iv) If aGA, then (g ° f){a) = g(f{a)) G g{B); hence (g° f){A) C g(B). Suppose g is not onto. 
Then g(B) is properly contained in C and so (g ° f)(A) is properly contained in C; thus g°f is 
not onto. Accordingly Hgofia onto, then g must be onto. 



134 LINEAR MAPPINGS [CHAP. 6 

6.9. Prove that a mapping f : A -> B has an inverse if and only if it is one-to-one and onto.

Suppose f has an inverse, i.e. there exists a function f⁻¹ : B -> A for which f⁻¹∘f = 1_A and
f∘f⁻¹ = 1_B. Since 1_A is one-to-one, f is one-to-one by Problem 6.8(iii); and since 1_B is onto, f is
onto by Problem 6.8(iv). That is, f is both one-to-one and onto.

Now suppose f is both one-to-one and onto. Then each b ∈ B is the image of a unique element
in A, say b*. Thus if f(a) = b, then a = b*; hence f(b*) = b. Now let g denote the mapping from
B to A defined by b |-> b*. We have:

(i) (g∘f)(a) = g(f(a)) = g(b) = b* = a, for every a ∈ A; hence g∘f = 1_A.
(ii) (f∘g)(b) = f(g(b)) = f(b*) = b, for every b ∈ B; hence f∘g = 1_B.
Accordingly, f has an inverse. Its inverse is the mapping g.

6.10. Let / : R ^ R be defined by fix) = 2x-S. Now / is one-to-one and onto; hence / 
has an inverse mapping /"^ Find a formula for f~^. 

Let y be the image of x under the mapping f: y - f(x) =2x-S. Consequently x will be the 
image of y under the inverse mapping f-K Thus solve for x in terms of y in the above equation: 
X = {y + 3)/2. Then the formula defining the inverse function is f-Hy) = {y + 3)/2. 

LINEAR MAPPINGS 

6.11. Show that the following mappings F are linear: 
(i) F : R2 ^ R2 defined by F{x, y)=^{x + y, x). 

(ii) F:R^-*R defined by F{x,y,z) = 2x-Sy + 4z. 

(i) Let v = (a,b) and w = (a',b'); hence 

V + w - (a + a',b + b') and kv = {ka, kb), k&R 

We have F(v) = (a -t- 6, a) and F(w) = (a' + b', a'). Thus 

F(v + w) = F(a + a',h + b') - (a.-\- a.' + h+b' , a+ a') 
= (a + 6, a) + (a' + b', a') = F{v) + F(w) 
and F(kv) - F(ka, kb) := (ka + kb, ka) = k(a + b,a) = kF{v) 

Since v, w and A; were arbitrary, F is linear. 

(ii) Let v-(a,b,c) and w = (a',b',c'); hence 

V + w = (a + a',b + b',c + e') and kv - (ka, kb, kc), fc e E 

We have F(v) = 2a - 36 -I- 4c and F(w) = 2a' - 36' -t- 4c'. Thus 

F(v + w) = F(a + a',b + b',c + c') = 2(a -1- a') - 3(6 + 6') -h 4(c -I- c') 
= (2a - 36 + 4c) + (2a' - 36' 4- 4c') = F(v) + F(w) 
and F(kv) = F(ka, kb, kc) = 2ka - 3kb + 4kc = k(2a - 36 + 4c) = kF(v) 

Accordingly, F is linear. 

6.12. Show that the following mappings F are not linear: 
(i) F : R2 ^ R defined by F(x, y) = xy. 

(ii) FrB?^B? defined by F{x, y) = {x + 1, 2y, x + y). 
(iii) F:W-^B? defined by F{x, y, z) = (\x\, 0). 

(i) Let ■w = (l,2) and w = (3,4); then v + w = (A,6). 

We have F(v) = 1*2 = 2 and F(w) = 3 • 4 = 12. Hence 



CHAP. 6] LINEAR MAPPINGS 135 

F(v + w) = F(4, 6) = 4 • 6 = 24 ^ F{v) + F{w) 
Accordingly, F is not linear. 

(ii) Since F{0, 0) = (1, 0, 0) ^ (0, 0, 0), F cannot be linear. 

(iii) Let v = (1, 2, 3) and k = -3; hence kv = (-3, -6, -9). 

We have F{v) = {1,0) and so kF (v) = -S(1,0) = {-S,0). Then 

Fikv) = F{-3, -6, -9) = (3, 0) # fcF('y) 
and hence F is not linear. 

6.13. Let V be the vector space of n-square matrices over K. Let M be an arbitrary matrix 
in F. Let r : F -* 7 be defined by T{A) = AM + MA, where AeV. Show that 
T is linear. 

For any A,BGV and any k G K, we have 

T{A +B) = (A + B)M + M{A + B) = AM + BM + MA + MB 
= {AM + MA) + {BM + MB) = T{A) + T{B) 
and T{kA) = {kA)M + M{kA) = k{AM) + k{MA) = k{AM + MA) ^ kT{A) 

Accordingly, T is linear. 

6.14. Prove Theorem 6.2: Let V and U be vector spaces over a field K. Let {^'i, ...,■?;„} be 
a basis of V and let Mi, . . . , «„ be any arbitrary vectors in U. Then there exists a 
unique linear mapping F:V-^U such that F{Vi) = Ui, F{v2) = %2, . . ., F{Vn) = Un. 

There are three steps to the proof of the theorem: (1) Define a mapping F .V ^ U such that 
F{v^ = Mi, i = l, ...,n. (2) Show that F is linear. (3) Show that F is unique. 

Step {1). Let V eV. Since {v^, . . .,i;„} is a basis of V, there exist unique scalars a^, . . .,a„ G X 

for which v = a^v^ + a^v^ -] + a„i;„. We define F:V ^ U by F{v) = a^Ui + a^u^ -\ h a„M„. 

(Since the osj are unique, the mapping F is well-defined.) Now, for i= \, ..., n, 

Vi - Ovi + ■ ■ ■ + Ivi + • ■ ■ + Ov„ 

Hence F{Vi) = Omj + • • • + Imj + • • ■ + Om„ = m; 

Thus the first step of the proof is complete. 

Step {2). Suppose v = a^Vi + a^v^ + • • • + a„'U„ and w = b^Vi + h^Vi + • • • + h^v^. Then 

V + w = (tti + hi)Vi + {a2 + b2)v2 + ■ • • + (a„ + 6„)v„ 

and, for any kG K, kv = ka^v^ + kazVz + • • ■ + ka^v^. By definition of the mapping F, 

F{v) = OiMi + a^u^ + • ■ • + a„M„ and F{w) = h^u^ + h^Vi + • ■ • + 6„m„ 

Hence F{v + w) = {a^ + h-i)ui + {a^ + 62)^2 +•••+(«„ + 6„)m„ 

= («!«, + ajMj + • • • + a„M„) + (6iMi + 62M2 + • • • + 6„M„) 
= F{v) + F(w) 

and f (fci;) = fc(oiMi + O2M2 + • • • + o„mJ = ^^(1;) 

Thus f is linear. 

Step (3). Now suppose G:V ^V is linear and G{v^ = M;, t = 1, . . ., m. If v = Oi^i + a^v^ + 
1- a^nVn, then 

G(t)) = G(aiVi + a^v^ + • • • + a„v„) - a^G{v-^ + a2G{v^ + • • • + a„G(v„) 

= OiMj + a2M2 + • • • + «„«*„ = ii'(t)) 

Since G(t)) = F{v) for every i? G V, G = F. Thus F is unique and the theorem is proved. 



136 LINEAR MAPPINGS [CHAP. 6 

6.15. Let T : R^2 -> R be the linear mapping for which

T(1, 1) = 3 and T(0, 1) = -2   (1)

(Since {(1, 1), (0, 1)} is a basis of R^2, such a linear mapping exists and is unique by
Theorem 6.2.) Find T(a, b).

First we write (a, b) as a linear combination of (1, 1) and (0, 1) using unknown scalars x and y:

(a, b) = x(1, 1) + y(0, 1)   (2)

Then (a, b) = (x, x) + (0, y) = (x, x + y) and so x = a, x + y = b

Solving for x and y in terms of a and b, we obtain

x = a and y = b - a   (3)

Now using (1) and (2) we have

T(a, b) = T(x(1, 1) + y(0, 1)) = xT(1, 1) + yT(0, 1) = 3x - 2y

Finally, using (3) we have T(a, b) = 3x - 2y = 3(a) - 2(b - a) = 5a - 2b.
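The same computation can be organized as a small Python sketch (an added illustration): solve for the coordinates of (a, b) in the basis {(1, 1), (0, 1)} and apply linearity; the sample point (2, 7) is an arbitrary choice.

```python
import numpy as np

# T is determined by T(1, 1) = 3 and T(0, 1) = -2.
def T(a, b):
    # Solve (a, b) = x*(1, 1) + y*(0, 1) for the coordinates x, y;
    # the columns of the matrix below are the basis vectors.
    x, y = np.linalg.solve(np.array([[1., 0.],
                                     [1., 1.]]), np.array([a, b]))
    return x*3 + y*(-2)

print(T(1, 1), T(0, 1))    # 3.0 -2.0  (the prescribed values)
print(T(2, 7), 5*2 - 2*7)  # -4.0 -4   (agrees with the formula 5a - 2b)
```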

6.16. Let T -.V^U be linear, and suppose Vi, ...,Vn&V have the property that their 
images T{vi), .... T{vn) are linearly independent. Show that the vectors vi, . . .,Vn 
are also linearly independent. 

Suppose that, for scalars ai, . . . , o„, a^v^ + ag^a + • • • + a„Vn = 0. Then 

= r(0) = Tia^vi + a.2V2 + h a„i'„) = aiT(vi) + azTiv^) + ■ ■ • + a„r(v„) 

Since the T{Vi) are linearly independent, all the O; = 0. Thus the vectors Vi,...,v„ are linearly 
independent. 

6.17. Suppose the linear mapping F:V^U is one-to-one and onto. Show that the inverse 
mapping F~^:U -*V is also linear. 

Suppose M.w'G U. Since F is one-to-one and onto, there exist unique vectors v,v'BV for 
which F{v) = u and F(v') = u'. Since F is linear, we also have 

F{v + v') - F{v) + F(v') = u + u' and F{kv) = kF(v) - ku 

By definition of the inverse mapping, F-i(tt) = v. F-i(m') = f', F-i-{u-\-u') = v + v' and 
F-^ku) = fc'U. Then 

F-Mm + m') = •« + •"' = F-i(m) + F-Mm') and F-HM = fei) = feF-Htt) 
and thus F"' is linear. 

IMAGE AND KERNEL OF LINEAR MAPPINGS 

6.18. Let F : R^4 -> R^3 be the linear mapping defined by

F(x, y, s, t) = (x - y + s + t, x + 2s - t, x + y + 3s - 3t)

Find a basis and the dimension of the (i) image U of F, (ii) kernel W of F.
(i) The images of the following generators of R^4 generate the image U of F:
F(1, 0, 0, 0) = (1, 1, 1)     F(0, 0, 1, 0) = (1, 2, 3)

F(0, 1, 0, 0) = (-1, 0, 1)    F(0, 0, 0, 1) = (1, -1, -3)

Form the matrix whose rows are the generators of U and row reduce to echelon form:

( 1  1  1 )      ( 1  1  1 )      ( 1  1  1 )
(-1  0  1 )  to  ( 0  1  2 )  to  ( 0  1  2 )
( 1  2  3 )      ( 0  1  2 )      ( 0  0  0 )
( 1 -1 -3 )      ( 0 -2 -4 )      ( 0  0  0 )

Thus {(1, 1, 1), (0, 1, 2)} is a basis of U; hence dim U = 2.





CHAP. 6] LINEAR MAPPINGS 137 

(ii) We seek the set of (x, y, s, t) such that F(x, y, s, t) = (0, 0, 0), i.e.,

F(x, y, s, t) = (x - y + s + t, x + 2s - t, x + y + 3s - 3t) = (0, 0, 0)

Set corresponding components equal to each other to form the following homogeneous system
whose solution space is the kernel W of F:

x - y + s + t = 0         x - y + s + t = 0
x + 2s - t = 0       or   y + s - 2t = 0       or   x - y + s + t = 0
x + y + 3s - 3t = 0       2y + 2s - 4t = 0          y + s - 2t = 0

The free variables are s and t; hence dim W = 2. Set

(a) s = -1, t = 0 to obtain the solution (2, 1, -1, 0),

(b) s = 0, t = 1 to obtain the solution (1, 2, 0, 1).

Thus {(2, 1, -1, 0), (1, 2, 0, 1)} is a basis of W. (Observe that dim U + dim W = 2 + 2 = 4,
which is the dimension of the domain R^4 of F.)
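As an added numerical check (not part of the original solution), the matrix of F confirms dim U = 2 by its rank, dim W = 2 by Theorem 6.4, and that the two kernel vectors found above are indeed mapped to 0.

```python
import numpy as np

# F(x, y, s, t) = (x - y + s + t, x + 2s - t, x + y + 3s - 3t) as a matrix.
F = np.array([[1., -1., 1.,  1.],
              [1.,  0., 2., -1.],
              [1.,  1., 3., -3.]])

rank = np.linalg.matrix_rank(F)
print(rank, 4 - rank)                    # 2 2  = dim U, dim W

print(F @ np.array([2., 1., -1., 0.]))   # [0. 0. 0.]
print(F @ np.array([1., 2., 0., 1.]))    # [0. 0. 0.]
```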



6.19. Let T : R^3 -> R^3 be the linear mapping defined by

T(x, y, z) = (x + 2y - z, y + z, x + y - 2z)
Find a basis and the dimension of the (i) image U of T, (ii) kernel W of T.
(i) The images of generators of R^3 generate the image U of T:

T(1, 0, 0) = (1, 0, 1),  T(0, 1, 0) = (2, 1, 1),  T(0, 0, 1) = (-1, 1, -2)

Form the matrix whose rows are the generators of U and row reduce to echelon form:

( 1  0  1 )      ( 1  0  1 )      ( 1  0  1 )
( 2  1  1 )  to  ( 0  1 -1 )  to  ( 0  1 -1 )
(-1  1 -2 )      ( 0  1 -1 )      ( 0  0  0 )

Thus {(1, 0, 1), (0, 1, -1)} is a basis of U, and so dim U = 2.

(ii) We seek the set of (x, y, z) such that T(x, y, z) = (0, 0, 0), i.e.,

T(x, y, z) = (x + 2y - z, y + z, x + y - 2z) = (0, 0, 0)

Set corresponding components equal to each other to form the homogeneous system whose
solution space is the kernel W of T:

x + 2y - z = 0        x + 2y - z = 0        x + 2y - z = 0
y + z = 0       or    y + z = 0       or    y + z = 0
x + y - 2z = 0        -y - z = 0

The only free variable is z; hence dim W = 1. Let z = 1; then y = -1 and x = 3. Thus
{(3, -1, 1)} is a basis of W. (Observe that dim U + dim W = 2 + 1 = 3, which is the dimen-
sion of the domain R^3 of T.)



6.20. Find a linear map F : R^3 -> R^4 whose image is generated by (1, 2, 0, -4) and (2, 0, -1, -3).

Method 1.

Consider the usual basis of R^3: e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1). Set F(e1) = (1, 2, 0, -4),
F(e2) = (2, 0, -1, -3) and F(e3) = (0, 0, 0, 0). By Theorem 6.2, such a linear map F exists and is
unique. Furthermore, the image of F is generated by the F(ei); hence F has the required property.
We find a general formula for F(x, y, z):

F(x, y, z) = F(xe1 + ye2 + ze3) = xF(e1) + yF(e2) + zF(e3)

= x(1, 2, 0, -4) + y(2, 0, -1, -3) + z(0, 0, 0, 0)

= (x + 2y, 2x, -y, -4x - 3y)


138 LINEAR MAPPINGS [CHAP. 6

Method 2.

Form a 4 x 3 matrix A whose columns consist only of the given vectors; say,

     ( 1  2  2 )
A =  ( 2  0  0 )
     ( 0 -1 -1 )
     (-4 -3 -3 )

Recall that A determines a linear map A : R^3 -> R^4 whose image is generated by the columns of A.
Thus A satisfies the required condition.

6.21. Let V be the vector space of 2 by 2 matrices over R and let

M = ( 1  2 )
    ( 0  3 )

Let F : V -> V be the linear map defined by F(A) = AM - MA. Find a basis and the
dimension of the kernel W of F.

We seek the set of matrices ( x  y ) such that F( x  y ) = ( 0  0 ).
                            ( s  t )          ( s  t )    ( 0  0 )

F( x  y )  =  ( x  y )( 1  2 )  -  ( 1  2 )( x  y )
 ( s  t )     ( s  t )( 0  3 )     ( 0  3 )( s  t )

           =  ( x   2x + 3y )  -  ( x + 2s   y + 2t )  =  ( -2s   2x + 2y - 2t )
              ( s   2s + 3t )     (   3s        3t  )     ( -2s        2s      )

Thus   2x + 2y - 2t = 0        or        x + y - t = 0
       2s = 0                            s = 0

The free variables are y and t; hence dim W = 2. To obtain a basis of W set
(a) y = -1, t = 0 to obtain the solution x = 1, y = -1, s = 0, t = 0;
(b) y = 0, t = 1 to obtain the solution x = 1, y = 0, s = 0, t = 1.

Thus  { ( 1 -1 ),  ( 1  0 ) }  is a basis of W.
        ( 0  0 )   ( 0  1 )



6.22. Prove Theorem 6.3: Let F.V^U be a linear mapping. Then (i) the image of F 
is a subspace of U and (ii) the kernel of F is a subspace of V. 

(i) Since F(Q) = 0, G Im F. Now suppose u, u' GlmF and a,b& K. Since u and u' belong to 
the image of F, there exist vectors v,v' GV such that F(v) = u and F(v') = u'. Then 

F{av + bv') - aF(v) + hF(v') = au + bu' e Im F 

Thus the image of F is a subspace of U. 

(ii) Since F(0) = 0, G Ker F. Now suppose v,wG Ker F and a,b e K. Since v and w belong 
to the kernel of F, F{v) = and F(w) = 0. Thus 

F(av + bw) = aF(v) + bF{w) ==: aO + 60 = and so av + bw S KerF 

Thus the kernel of F is a subspace of V. 



6.23. Prove Theorem 6.4: Let V be of finite dimension, and let F : V -> U be a linear map-
ping with image U' and kernel W. Then dim U' + dim W = dim V.

Suppose dim V = n. Since W is a subspace of V, its dimension is finite; say, dim W = r ≤ n.
Thus we need prove that dim U' = n - r.


CHAP. 6] LINEAR MAPPINGS 139

Let {w_1, ..., w_r} be a basis of W. We extend {w_i} to a basis of V:

{w_1, ..., w_r, v_1, ..., v_{n-r}}

Let B = {F(v_1), F(v_2), ..., F(v_{n-r})}

The theorem is proved if we show that B is a basis of the image U' of F.

Proof that B generates U'. Let u ∈ U'. Then there exists v ∈ V such that F(v) = u. Since
{w_i, v_j} generates V and since v ∈ V,

v = a_1 w_1 + ... + a_r w_r + b_1 v_1 + ... + b_{n-r} v_{n-r}

where the a_i, b_j are scalars. Note that F(w_i) = 0 since the w_i belong to the kernel of F. Thus

u = F(v) = F(a_1 w_1 + ... + a_r w_r + b_1 v_1 + ... + b_{n-r} v_{n-r})
= a_1 F(w_1) + ... + a_r F(w_r) + b_1 F(v_1) + ... + b_{n-r} F(v_{n-r})

= a_1·0 + ... + a_r·0 + b_1 F(v_1) + ... + b_{n-r} F(v_{n-r})

= b_1 F(v_1) + ... + b_{n-r} F(v_{n-r})

Accordingly, the F(v_j) generate the image of F.

Proof that B is linearly independent. Suppose

a_1 F(v_1) + a_2 F(v_2) + ... + a_{n-r} F(v_{n-r}) = 0

Then F(a_1 v_1 + a_2 v_2 + ... + a_{n-r} v_{n-r}) = 0 and so a_1 v_1 + ... + a_{n-r} v_{n-r} belongs to the kernel W
of F. Since {w_i} generates W, there exist scalars b_1, ..., b_r such that

a_1 v_1 + a_2 v_2 + ... + a_{n-r} v_{n-r} = b_1 w_1 + b_2 w_2 + ... + b_r w_r

or a_1 v_1 + ... + a_{n-r} v_{n-r} - b_1 w_1 - ... - b_r w_r = 0   (*)

Since {w_i, v_j} is a basis of V, it is linearly independent; hence the coefficients of the w_i and v_j in (*)
are all 0. In particular, a_1 = 0, ..., a_{n-r} = 0. Accordingly, the F(v_j) are linearly independent.

Thus B is a basis of U', and so dim U' = n - r and the theorem is proved.



6.24. Suppose f:V-*U is linear with kernel W, and that f{v) = u. Show that the "coset" 
V + W = {v + w: w e W} is the preimage of u, that is, f~^{u) — v + W. 

We must prove that (i) f~Hu)cv + W and (ii) v + T^ c/-i(m). We first prove (i). Suppose 
v'Gf-Hu). Then f(v') = u and so f(v' - v) = f(v') - f{v) = u-u = 0, that is, v'-vGW. Thus 
v' = V + (v' — v) €. V + W and hence f~Hu) Cv + W. 

Now we prove (ii). Suppose v' G v+W. Then v' = 1; + w where w G W. Since W is the 
kernel of /, f(w) = 0. Accordingly, f{v') = /(-u + w) = f(v) + f(w) = /(t)) + = f(v) = m. Thus 
v' e /-i(m) and so v + Wc f-^(u). 



SINGULAR AND NONSINGULAR MAPPINGS 

6.25. Suppose F:V ^U is linear and that V is of finite dimension. Show that V and the 
image of F have the same dimension if and only if F is nonsingular. Determine all 
nonsingular mappings T : R* ^ R^. 

By Theorem 6.4, dim F = dim (Im/f) + dim (Ker/i^). Hence V and ImF have the same di- 
mension if and only if dim (Ker F) = or KerF = {0}, i.e. if and only if F is nonsingular. 

Since the dimension of R^ is less than the dimension of R*, so is the dimension of the image of 
T. Accordingly, no linear mapping T : B* -» R^ can be nonsingular. 

6.26. Prove that a linear mapping F:V-*U is nonsingular if and only if the image of 
an independent set is independent. 

Suppose F is nonsingular and suppose {v^, . . ., v^} is an independent subset of V. We claim that 

the vectors F(vi) F(vJ are independent. Suppose aiF{Vi) + a<^{v^ +;•••+ a„F{v„) - 0, where 

a, e X. Since F is linear, F(ajVi + a^v^, + • • • + o^vj = 0; hence 

a^Vy + 021^2 + • • • + On^n ^ Ker F 



140 LINEAR MAPPINGS 



[CHAP. 6 



But F is nonsingular, i.e. Ker F = {0}; hence a^v^ + a^v^ +■■■ + a„v„ = 0. Since the i;; are linearly 
independent, all the a^ are 0. Accordingly, the F(v>i are linearly independent. In other words, the 
image of the independent set {v^, . . . , i)„} is independent. 

On the other hand, suppose the image of any independent set is independent. If v G V is 
nonzero, then {v} is independent. Then {F{v)} is independent and so F(v) # 0. Accordingly, F is 
nonsingular. 

OPERATIONS WITH LINEAR MAPPINGS 

6.27. Let F : R^3 -> R^2 and G : R^3 -> R^2 be defined by F(x, y, z) = (2x, y + z) and G(x, y, z) =
(x - z, y). Find formulas defining the mappings F + G, 3F and 2F - 5G.

(F + G)(x, y, z) = F(x, y, z) + G(x, y, z)

= (2x, y + z) + (x - z, y) = (3x - z, 2y + z)
(3F)(x, y, z) = 3F(x, y, z) = 3(2x, y + z) = (6x, 3y + 3z)
(2F - 5G)(x, y, z) = 2F(x, y, z) - 5G(x, y, z) = 2(2x, y + z) - 5(x - z, y)

= (4x, 2y + 2z) + (-5x + 5z, -5y) = (-x + 5z, -3y + 2z)

6.28. Let F : R^3 -> R^2 and G : R^2 -> R^2 be defined by F(x, y, z) = (2x, y + z) and G(x, y) =
(y, x). Derive formulas defining the mappings G∘F and F∘G.

(G∘F)(x, y, z) = G(F(x, y, z)) = G(2x, y + z) = (y + z, 2x)
The mapping F∘G is not defined since the image of G is not contained in the domain of F.

6.29. Show: (i) the zero mapping 0, defined by 0{v) = for every v GV, is the zero ele- 
ment of Hom(F, U); (ii) the negative of F G Hom(7, U) is the mapping {-1)F, i.e. 
-F = (-l)F. 

(i) Let F G Hom {V, U). Then, for every v GV, 

{F + Q){v) = F(,v) + 0{v) = F{v) + = F(v) 
Since (F + 0)(v) = F(v) for every v eV, F + = F. 

(ii) For every v G V, 

[F + {-l)F){v) = F{v) + {-l)F{v) = F{v) - F{v} = = 0{v) 

Since {F + {-l)F){v) = 0(v) for every vGV, F + (-l)F = 0. Thus (-l)F is the negative 
of F. 

6.30. Show that for Fi, ...,F„G Hom {V, U) and ai, ...,a„GK, and for any vGV, 

{aiFi + a2F2 H + a„F„)(i;) = aiFi{v) + aJFiiv) + ■ ■ • + ajf'niv) 

By definition of the mapping aiFj, (a^F^iv) = a^F^{v); hence the theorem holds for n = 1. 
Thus by induction, 

(aiFi + (I2F2 + • • ■ + a„F„)(i;) = (a^F^)(v) + {a^F^ + • ■ • + a„F„)(i;) 

= aiFiCv) + a^F^iv) + • • • + a„F„(D) 

6.31. Let /^:R3^R2, G.W^B? and HrR^^R^ be defined by i^'Cx, i/, 2) = {x + y + z,x + y), 

G{x, y, z) = {2x + z,x + y) and i?(a;, y, z) = {2y, x). Show that F,G,H G Hom (RS R2) 

are linearly independent. 

Suppose, for scalars a,b,c G K, 

aF + bG + cH = {1) 

(Here is the zero mapping.) For e^ = (1, 0, 0) G R3, we have 

(aF + bG + cH)(e{) = aF(l, 0, 0) + bG(l, 0, 0) + cH(l, 0, 0) 

= a(l, 1) + 6(2, 1) + c(0, 1) = (a + 2b,a + b + c) 



CHAP. 6] LINEAR MAPPINGS 141 

and 0(ei) = (0, 0). Thus by {!), (a + 2b, a+b + e) = (0, 0) and so 

a + 26 = and a + 6 + c = (2) 

Similarly for eg = (0, 1, 0) e R3, we have 

(aF + bG + cH){e2) = aF(0, 1, 0) + 6G(0, 1, 0) + cH(0, 1, 0) 

= a(l, 1) + 6(0, 1) + c(2, 0) = (a+2c, a+6) = 0(62) = (0,0) 

Thus a + 2e = and a + 6 = (5) 

Using (2) and (5) we obtain a = 0, 6 = 0, c = (■*) 

Since (1) implies (4), the mappings F, G and H are linearly independent. 

6.32. Prove Theorem 6.7: Suppose dim y = m and dim U = n. Then dim Hom {V, U) - mn. 

Suppose {vi, . . .,v„} is a basis of V and {mj, . . .,m„} is a basis of V. By Theorem 6.2, a linear 
mapping in Hom {V, V) is uniquely determined by arbitrarily assigning elements of t/ to the basis 
elements Vj of V. We define 

F^ e Hom {V,U), i = 1, . . . , m, j = 1, ...,n 

to be the linear mapping for which Fij{v^ = Uj, and Fij(Vk) -0 for fe # i. That is, Fy maps Vi 
into Mj and the other v's into 0. Observe that {Fy} contains exactly mn elements; hence the theorem 
is proved if we show that it is a basis of Hom {V, U). 

Proof that {Fy} generates Hom (F, U). Let F G Hom {V, U). Suppose F{vi) = w^, F(v2) = 
W2, ..., F(Vm) = Wm- Since w^ G U, it is a linear combination of the u's; say, 

Wk = afclMl + «fc2«*2 + • • • + fflfc„Mn> fc = 1, . . . , m, Oy G X (i) 



TTi n 



Consider the linear mapping G = 2 2 ayFy Since G is a linear combination of the Fy, the 

i=l i=l 

proof that {Fy} generates Hom (V, t7) is complete if we show that F = G. 

We now compute G(Vk), k = l, ...,m. Since Fy('Ufc) = for k^i and ^^((Vfc) = Mj, 

m n n t 

G(i;k) = 22 OiiF«('yic) = 2 OfciJ^)cj(vic) = 2 Ofci«j 

i=l 3 = 1 3 = 1 3 = 1 

= a^iMj + ak2'"-2 + • • • + »fcnMn 

Thus by (1), G{v^,) = w^. for each k. But ^(1;^) = w^ for each fe. Accordingly, by Theorem 6.2, 
F = G; hence {Fy} generates Hom (V, U). 

Proof that {Fy} is linearly independent. Suppose, for scalars ay G K, 



2 2 »«^« - 

i=l 3 = 1 



For i;^, fc = 1, . . .,w, 



= 0(v^) = 22 «ii^ij(^ic) = 2 a^jF^j(v^) = 2 aicjMi 

i = l j = l 3 — 1 3 — i 

= afcl^l + ak2M2 + • • • + fflfen^n 

But the Mj are linearly independent; hence for k = 1, . . .,m, we have a^i — 0, 0^2 = 0, . . . , ajj„ = 0. 
In other words, all the ay = and so {Fy} is linearly independent. 

Thus {Fy} is a basis of Hom (V, 17); hence dim Hom {V, U) = mn. 

6.33. Prove Theorem 6.8: Let V, U and W be vector spaces over K. Let F, F' be linear 
mappings from V into f7 and let G, G' be linear mappings from U into W; and let 
k&K. Then: (i) Go(F + F') - Goi?' + Goii''; (ii) {G + G')oF = GoF + G'oF; 
(iii) fcCGoF) = {kG)oF = Go(fcF). 

(i) For every v GV, 



142 LINEAR MAPPINGS [CHAP. 6 

(Go(F + F'mv) = G{(F + F'){v)) = G{F(v) + F'(v)) 

= G(F{v)) + G{F'(v)) = {G'>F)(v) + {GoF')(v) = {G ° F + G o F'){v) 
Since {G ° (F + F'){v) = (G o F + G ° F'){v) for every vGV, Go {F + F') = G°F + G°F'. 
(ii) For every v &V, 

{(G + G')°F)(v) = {G + G')(F{v)) = G{F{v)) + G'{F(v)) 

= (Go F)(v) + {G' °F){v) = (G ° F + G' o F)(v) 
Since ({G + G') ° F}(v) = {G ° F + G ° F')(v) for every v&V, (G + G')° F = G°F + G' °F. 
(iii) For every v GV, 

(k{G°F))(v) = k(G°F){v) = k{G{F(v))) = {kG)(F{v)) = (feG°F)(i;) 
and {k{G°Fmv) = k(GoF){v) = k(G{F(v))) = G{kF{v)) = G{(kF){v)) = {G°kF){v) 

Accordingly, k{G°F) = (kG)oF = G°(kF). (We emphasize that two mappings are shown to 
be equal by showing that they assign the same image to each point in the domain.) 

6.34. Let F:V^V and G.U^W be linear. Hence {GoF):V^W is linear. Show that 
(i) rank {GoF) ^ rank G, (ii) rank (GoF) ^ rank F. 

(i) Since F{V) c U, we also have G(F{V)) c G(U) and so dim G(F{V)) ^ dim G(V). Then 
rank (GoF) = dim ((GoF)(y)) = dim (G(F(y))) ^ dim G(?7) = rank G 

(ii) By Theorem 6.4, dim (G(F(y))) ^ dim F(y). Hence 

rank (GoF) = dim ((Go F)(y)) = dim (G(F(y))) =£ dim F(y) = rank F 

ALGEBRA OF LINEAR OPERATORS 

6.35. Let S and T be the linear operators on R^ defined by S{x, y) = {y, x) and T{x, y) = 
(0, x). Find formulas defining the operators S + T,2S- ZT, ST, TS, S^ and T^. 

{S+T){x,y) = S(x,y) + T(x,y) = {y,x) + (0,x) = {y,2x). 

(2S-ZT)(x,y) = 2S(x,y)-3T{x,y) = 2{y,x) - Z((i,x) = (2y,-x). 

(ST)(x,y) = S{.T(x,y)) = S(f),x) - (a;,0). 

(TS)(x,y) = T(S(x,y)) = T(y,x) = {0,y). 

SHx,y) = S{S{x,y)) = S{y,x) = (x,y). Note S^ = I, the identity mapping. 

THx, y) = T(T(x, y)) = 7(0, x) - (0, 0). Note T^ = 0, the zero mapping. 

6.36. Let T be the linear operator on R^2 defined by

T(3, 1) = (2, -4) and T(1, 1) = (0, 2)   (1)

(By Theorem 6.2, such a linear operator exists and is unique.) Find T(a, b). In
particular, find T(7, 4).

First write (a, b) as a linear combination of (3, 1) and (1, 1) using unknown scalars x and y:

(a, b) = x(3, 1) + y(1, 1)   (2)

Hence (a, b) = (3x, x) + (y, y) = (3x + y, x + y) and so   3x + y = a,  x + y = b

Solving for x and y in terms of a and b,

x = (a - b)/2  and  y = (3b - a)/2   (3)

Now using (2), (1) and (3),

T(a, b) = xT(3, 1) + yT(1, 1) = x(2, -4) + y(0, 2)

= (2x, -4x) + (0, 2y) = (2x, -4x + 2y) = (a - b, 5b - 3a)
Thus T(7, 4) = (7 - 4, 20 - 21) = (3, -1).



CHAP. 6] LINEAR MAPPINGS 143 

6.37. Let T be the operator on R^3 defined by T(x, y, z) = (2x, 4x - y, 2x + 3y - z). (i) Show
that T is invertible. (ii) Find a formula for T⁻¹.

(i) The kernel W of T is the set of all (x, y, z) such that T(x, y, z) = (0, 0, 0), i.e.,

T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (0, 0, 0)
Thus W is the solution space of the homogeneous system

2x = 0,  4x - y = 0,  2x + 3y - z = 0
which has only the trivial solution (0, 0, 0). Thus W = {0}; hence T is nonsingular and so by
Theorem 6.9 is invertible.

(ii) Let (r, s, t) be the image of (x, y, z) under T; then (x, y, z) is the image of (r, s, t) under T⁻¹:
T(x, y, z) = (r, s, t) and T⁻¹(r, s, t) = (x, y, z). We will find the values of x, y and z in terms
of r, s and t, and then substitute in the above formula for T⁻¹. From

T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (r, s, t)

we find x = r/2, y = 2r - s, z = 7r - 3s - t. Thus T⁻¹ is given by

T⁻¹(r, s, t) = (r/2, 2r - s, 7r - 3s - t)
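A quick numerical check of this formula (an added sketch, not part of the original solution): inverting the matrix of T in the usual basis reproduces the coefficients of T⁻¹.

```python
import numpy as np

# T(x, y, z) = (2x, 4x - y, 2x + 3y - z) in the usual basis of R^3.
T = np.array([[2.,  0.,  0.],
              [4., -1.,  0.],
              [2.,  3., -1.]])

print(np.linalg.inv(T))
# rows (0.5, 0, 0), (2, -1, 0), (7, -3, -1),
# i.e. T^-1(r, s, t) = (r/2, 2r - s, 7r - 3s - t)
```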

6.38. Let V be of finite dimension and let T be a linear operator on V. Recall that T is 
invertible if and only if T is nonsingular or one-to-one. Show that T is invertible if 
and only if T is onto. 

By Theorem 6.4, dim V = dim (Im T) + dim (Ker T). Hence the following statements are 
equivalent: (i) T is onto, (ii) Im T = V, (iii) dim (Im r) = dimV, (iv) dim (Ker T) = 0, 
(v) Ker T = {0}, (vi) T is nonsingular, (vii) T is invertible. 

6.39. Let V be of finite dimension and let T be a linear operator on V for which TS = I, 
for some operator S on V. (We call S a right inverse of T.) (i) Show that T is 
invertible. (ii) Show that S = T~^. (iii) Give an example showing that the above 
need not hold if V is of infinite dimension. 

(i) Let dim V = n. By the preceding problem, T is invertible if and only if T is onto; hence T 
is invertible if and only if rank T = n. We have n = rank / = rank TS — rank T — n. 
Hence rank T = n and T is invertible. 

(ii) rr-i = r-ir = /. Then s = /s = (r-ir)s = r-i(rs) = r-»/= r-i. 

(iii) Let V be the space of polynomials in t over K; say, p(t) = ao + Oji + Ojf^ + • • ■ + a„t". Let T 
and S be the operators on V defined by 

T(p{t)) = + ai + a2t+ ■■■ + a„«""i and S{p(t)) = a^t + a^t^ + • • ■ + a„t«+i 

We have (rS)(p(«)) = T(S{p{t))) = r(aot + Oit2 + • • • + a„<»+i) 

= Oo + ajt + • • • + o„<« = p{t) 

and so TS = I, the identity mapping. On the other hand, ii k G K and fc # 0, then (,ST){k) = 
S(T(k)) = S(0) = 0¥'k. Accordingly, ST ¥= I. 

6.40. Let S and T be the linear operators on R^ defined by S{x, y) = (0, x) and T{x, y) - 
{x, 0). Show that TS = but ST # 0. Also show that T^ = T. 

(TS){x,y) = T(S{x,y)) = T{0,x) = {0,0). Since TS assigns = (0,0) to every (a;,j/)GR2, it 
is the zero mapping: TS = 0. 

(ST){x,y)=S(T{x,y)) = S(x,0) = (0,x). For example, (Sr)(4, 2) = (0, 4). Thus ST ¥- 0, since 
it does not assign = (0, 0) to every element of R^. 

For any {x,y) € R2, T^x.y) = T(T(x,y)) = T{x,0) = {x,Q) = T{x,y). Hence T^ =- T. 




144 LINEAR MAPPINGS [CHAP. 6 

MISCELLANEOUS PROBLEMS

6.41. Let {e1, e2, e3} be a basis of V and {f1, f2} a basis of U. Let T : V -> U be linear.
Furthermore, suppose

T(e1) = a1 f1 + a2 f2
                                             ( a1  b1  c1 )
T(e2) = b1 f1 + b2 f2         and       A =  ( a2  b2  c2 )

T(e3) = c1 f1 + c2 f2

Show that, for any v ∈ V, A[v]_e = [T(v)]_f where the vectors in K^3 and K^2 are written
as column vectors.

Suppose v = k1 e1 + k2 e2 + k3 e3; then [v]_e = (k1, k2, k3)^T, written as a column. Also,

T(v) = k1 T(e1) + k2 T(e2) + k3 T(e3)

= k1(a1 f1 + a2 f2) + k2(b1 f1 + b2 f2) + k3(c1 f1 + c2 f2)

= (a1 k1 + b1 k2 + c1 k3) f1 + (a2 k1 + b2 k2 + c2 k3) f2

Accordingly, [T(v)]_f = (a1 k1 + b1 k2 + c1 k3, a2 k1 + b2 k2 + c2 k3)^T

Computing, we obtain

A[v]_e  =  ( a1  b1  c1 ) (k1)   =  ( a1 k1 + b1 k2 + c1 k3 )  =  [T(v)]_f
           ( a2  b2  c2 ) (k2)      ( a2 k1 + b2 k2 + c2 k3 )
                          (k3)


6.42. Let A; be a nonzero scalar. Show that a linear map T is singular if and only if kT 
is singular. Hence T is singular if and only if —T is singular. 

Suppose T is singular. Then T{v) = for some vector v ¥= 0. Hence ikT){v) = kT(v) = &0 = 
and so kT is singular. 

Now suppose kT is singular. Then {kT^(w) — for some vector w # 0; hence ^(fcw) = 
kT(w) = (kT)(w) = 0. But fc # and w #^ implies kw ¥= 0; thus T is also singular. 

6.43. Let £• be a linear operator on V for which E^ = E. (Such an operator is termed a 
projection.) Let C/ be the image of E and W the kernel. Show that: (i) if m G C/, 
then £'(m) - u, i.e. £7 is the identity map on U; (ii) if E ^I, then ^ is singular, i.e. 
E{v) = for some v^O; (iii) V = U®W. 

(i) If u&TJ, the image of S, then E{v) = u for some v GV. Hence using E^ — E, we have 

u ^ E(v) = EHv) = E(E{v)) = E{u) 

(ii) U E ¥= I then, for some v £ F, Bl'y) = u where v v^ m. By (i), E(u) - u. Thus 
E'(v — m) = S(v) — S(m) = m — m = where v — u¥=0 

(iii) We first show that V - U + W. Let v e y. Set m = E(v) and w = v- E{v). Then 

■U = £(l>) + I) — £'('U) = M + Of 

By definition, m = E{v) S J7, the image of E. We now show that w e TF, the kernel of E: 

E(w) = E{v - E(v)) = E(v) - E^v) = E(v) - E(v) - 

and thus w G W. Hence V = U + W. 

We next show that UnW - {0}. Let v eUnW. Since vGU, E(v) = v by (i). Since 
v&W, E{v) = 0. Thus V = E(v) = and so UnW - {0}. 

The above two properties imply that V = U ® W. 



CHAP. 6] 



LINEAR MAPPINGS 



145 



6.44. Show that a square matrix A is invertible if and only if it is nonsingular. (Compare 
with Theorem 6.9, page 130.) 

Recall that A is invertible if and only if A is row equivalent to the identity matrix 7. Thus the 
following statements are equivalent: (i) A is invertible. (ii) A and 1 are row equivalent, (iii) The 
equations AX = and IX = have the same solution space, (iv) AX = has only the zero solu- 
tion, (v) A is nonsingular. 



Supplementary Problems 



MAPPINGS 

6.45. State whether each diagram defines a mapping from {1, 2, 3} into {4, 5, 6}. 






6.46. Define each of the following mappings / : R -> R by a formula: 
(i) To each number let / assign its square plus 3. 

(ii) To each number let / assign its cube plus twice the number. 

(iii) To each number — 3 let / assign the number squared, and to each number < 3 let / assign 
the number —2. 

6.47. Let /:R^R be defined by f(x) = x^-4x + 3. Find (i) /(4), (ii) /(-3), (iii) /(j/ - 2a;), (iv)/(a!-2). 

6.48. Determine the number of different mappings from {o, 6} into {1, 2, 3}. 

6.49. Let the mapping g assign to each name in the set {Betty, Martin, David, Alan, Rebecca} the number 
of different letters needed to spell the name. Find (i) the graph of g, (ii) the image of g. 

6.50. Sketch the graph of each mapping: (i) f(x) = ^x — 1, (ii) g(x) = 2x^ — 4x — 3. 

6.51. The mappings f:A-^B, g:B-^A, h:C-*B, F-.B^C and GiA^C are illustrated in the 
diagram below. 




Determine whether each of the following defines a composition mapping and, if it does, find its 
domain and co-domain: {\)g°f, {n)h°f, (iii) Fo/, (iv)G°f, {y)g°h, (vi) h°G°g. 

6.52. Let /:R^R and fir : R -^ R be defined by f(x) = x^ + Sx + l and g(x) = 2x-3. Find formulas 
defining the composition mappings (i) f°g, (ii) g°f, (iii) g°g, (iv) f°f. 



6.53. For any mapping f:A->B, show that 1b° f — f — f°'^A- 



146 LINEAR MAPPINGS [CHAP. 6 

6.54. For each of the following mappings / : R -> R find a formula for the inverse mapping: (i) f{x) = 
Sx - 7, (ii) fix) = x» + 2. 

LINEAR MAPPINGS 

6.55. Show that the following mappings F are linear: 
(i) F : R2 ^ R2 defined by F(x, y) = {2x - y, x). 
(ii) F : R3 -» R2 defined by F{x, y, z) = {z,x + y). 
(iii) jF : R -> R2 defined by F(x) = (2.x, Zx). 

(iv) F : R2 ^ R2 defined by F(x, y) = [ax + hy, ex + dy) where a, 6, c, d e R. 

6.56. Show that the following mappings F are not linear: 
(i) F : R2 ^ R2 defined by F(x, y) = (x^, y^). 

(ii) F : R3 ^ R2 defined by Fix, y,z) = ix + l,y + z). 
(iii) F : R ^ R2 defined by Fix) = ix, 1). 
(iv) F:R2->R defined by Fix,y) = \x-y\. 

6.57. Let V be the vector space of polynomials in t over K. Show that the mappings T :V -*V and 
S :V -> V defined below are linear: 

Tiaa + ait + • • • + a^t") = a^t + a^t^ + ■ • • + a„t" + i 

S(ao + ai« + • • • + a„t") = Q + ax + a^t + ■ ■ ■ + aj"--^ 

6.58. Let V be the vector space ot nXn matrices over K; and let M be an arbitrary matrix in V. Show 
that the first two mappings T -.V ^V are linear, but the third is not linear (unless M = 0): 
(i) r(A) = MA, (ii) TiA) - MA -AM, (iii) TiA) =^ M + A. 

6.59. Find Tia, b) where T : R2 ^ R3 is defined by r(l, 2) = (3, -1, 5) and r(0, 1) = (2, 1, -1). 

6.60. Find Tia, b, c) where T : RS ^ R is defined by 

Til, 1, 1) = 3, r(0, 1, -2) = 1 and ^(O, 0, 1) = -2 

6.6L Suppose F:V -*U is linear. Show that, for any vGV, Fi-v) = -Fiv). 

6.62. Let W be a subspace of V. Show that the inclusion map of W into V, denoted by i:W cV and 
defined by t(w) = w, is linear. 

KERNEL AND IMAGE OF LINEAR MAPPINGS 

6.63. For each of the following linear mappings F, find a basis and the dimension of (a) its image U 
and (6) its kernel W: 

(i) F : R3 -> R8 defined by F(x, y, z) = ix + 2y,y-z,x + 2z). 
(ii) F : R2 ^ R2 defined by Fix,y) = ix + y,x + y). 
(iii) F : R3 ^ R2 defined by Fix, y,z) - ix + y,y + z). 

6.64. Let V be the vector space of 2 X 2 matrices over R and let M = f j . Let F : V -* V be the 

linear map defined by FiA) = MA. Find a basis and the dimension of (i) the kernel TF of F and 
(ii) the image U of F. 

6.65. Find a linear mapping F : R3 ^ RS whose image is generated by (1, 2, 3) and (4, 5, 6). 

6.66. Find a linear mapping F : R* ^ RS whose kernel is generated by (1, 2, 3, 4) and (0, 1, 1, 1). 

6.67. Let V be the vector space of polynomials in t over R. Let D:V -*V be the differential operator: 
Dif) = df/dt. Find the kernel and image of D. 

6.68. Let F:V-^U be linear. Show that (i) the image of any subspace of y is a subspace of U and 
(ii) the preimage of any subspace of U is a subspace of V. 



CHAP. 6] LINEAR MAPPINGS 147 

6.69. Each of the following matrices determines a linear map from K,* into R^: 

'12 1^ 

(i) A = ( 2 -1 2 -1 I (ii) B = 




^1 -3 2 -2/ 
Find a basis and the dimension of the image U and the kernel W of each map. 

6.70. Let r : C -> C be the conjugate mapping on the complex field C. That is, T(z) = z where z G C, 
or T(a + bi) = a— bi where a, 6 e R. (i) Show that T is not linear if C is viewed as a vector 
space over itself, (ii) Show that T is linear if C is viewed as a vector space over the real field R. 

OPERATIONS WITH LINEAR MAPPINGS 

6.71. Let iJ' : R3 -» R2 and G : R^ ^ R2 be defined by F{x, y, z) = (y,x + z) and G(x, y, z) = (2«, x - y). 
Find formulas defining the mappings F + G and SF — 20. 

6.72. Let H : R2 -♦ R2 be defined by H(x, y) — (y, 2x). Using the mappings F and G in the preceding 
problem, find formulas defining the mappings: (i) H°F and H °G, (ii) F°H and G°H, 
(in) Ho(F + G) and HoF + H°G. 

6.73. Show that the following mappings F, G and H are linearly independent: 
(i) F,G,He Horn (R2, R2) defined by 

Fix, y) = {x, 2y), G{x, y) = {y,x + y), H{x, y) = (0, x). 
(ii) F,G,He Hom (R3, R) defined by 

F{x, y, z) = x + y + z, G(x, y,z) — y + z, H(x, y, z) = x — z. 

6.74. For F,G & Rom {V, U), show that rank (F + G) ^ rank i^ + rank G. (Here V has finite 
dimension.) 

6.75. Let F :V -^ U and G:U-*V be linear. Show that if F and G are nonsingular then G°F is 
nonsingular. Give an example where G°F is nonsingular but G is not. 

6.76. Prove that Hom (V, U) does satisfy all the required axioms of a vector space. That is, prove 
Theorem 6.6, page 128. 

ALGEBRA OP LINEAR OPERATORS 

6.77. Let S and T be the linear operators on R2 defined by S{x, y) — {x + y, 0) and T{x, y) = (—y, x). 
Find formulas defining the operators S + T, 5S - ST, ST, TS, S^ and T^. 

6.78. Let T be the linear operator on R2 defined by T{x, y) — {x + 2y, 3x + Ay). Find p(T) where 
p{t) = t2 _ 5f _ 2. 

6.79. Show that each of the following operators T on R^ is invertible, and find a formula for T~h 

(i) T{x, y, z) = (x-3y- 2z, y - 4«, z), (ii) T{x, y,z) = {x + z,x- z, y). 

6.80. Suppose S and T are linear operators on V and that S is nonsingular. Assume V has finite dimen- 
sion. Show that rank (ST) = rank (TS) = rank T. 



6.81. Suppose V = U ® W. Let Ei and E2 be the linear operators on V defined by Ei(v) = u, 
E2(v) = w, where v = u + w, ue.U,w&W. Show that: (i) Bj = E^ and eI = E2, i.e. that Ei 
and £?2 are "projections"; (ii) Ei + E2 — I, the identity mapping; (iii) E1E2 = and E2E1 = 0. 



6.82. Let El and E2 be linear operators on V satisfying (i), (ii) and (iii) of Problem 6.81. Show that V 
is the direct sum of the image of E^ and the image of £2- ^ — Im £?i © Im £2- 

6.83. Show that if the linear operators S and T are invertible, then ST is invertible and (ST)-^ = T-^S-^. 



148 LINEAR MAPPINGS [CHAP. 6 

6.84. Let V have finite dimension, and let T be a linear operator on V such that rank (T^) = rank T. 
Show that Ker TnlmT = {0}. 

MISCELLANEOUS PROBLEMS 

6.85. Suppose T -.K^-^ X"» is a linear mapping. Let {e^, . . . , e„} be the usual basis of Z" and let A be 
the mXn matrix whose columns are the vectors r(ei), . . ., r(e„) respectively. Show that, for every 
vector V G R"-, T(v) = Av, where v is written as a column vector. 

6.86. Suppose F -.V -* U is linear and fc is a nonzero scalar. Show that the maps F and kF have the 
same kernel and the same image. 

6.87. Show that if F:V -^ U is onto, then dim U - dim V. Determine all linear maps T:W-*R* 
which are onto. 

6.88. Find those theorems of Chapter 3 which prove that the space of w-square matrices over K is an 
associative algebra over K. 

6.89. Let T :V ^ U be linear and let W he a subspace of V. The restriction of T to W is the map 
Tt^-.W^U defined by r^(w) = T{w), for every wGW. Prove the following, (i) T^^ is linear. 
(ii) Ker T^ = Ker T n W. (iii) Im T^r = T(W). 

6.90. Two operators S, T G A(V) are said to be similar if there exists an invertible operator P G A{V) 
for which S = P-^TP. Prove the following, (i) Similarity of operators is an equivalence relation. 
(ii) Similar operators have the same rank (when V has finite dimension). 



Answers to Supplementary Problems 

6.45. (i) No, (ii) Yes, (iii) No. 

6.46. (i) fix) = x2 + 3, (ii) /(») = a;3 + 2a;, (iii) fix) = {"^ if » - 3 

[-2 if a; < 3 

6.47. (i) 3, (ii) 24, (iii) j/2 - 4xy + 4x2 ^^y + ^x + S, (iv) a;^ - 8a! + 15. 

6.48. Nine. 

6.49. (i) {(Betty, 4), (Martin, 6), (David, 4), (Alan, 3), (Rebecca, 5)}. 
(ii) Image of g - {3, 4, 5, 6}. 

6.51. (i) {g o f) : A -* A, (ii) No, (iii) (F o /) : A -» C, (iv) No, (v) (goh) :C ^ A, (yi) {h°G°g) iB ^ B. 

6.52. (i) (/ ° g){x) = 4a;2 - 6a; + 1 (iii) (g o g)(x) =Ax-9 

(ii) (f^°/)(a;) = 2a;2 + 6a;-l (iv) (/ o /)(a;) = a;* + Ga;* + 14a;2 + 15x + 5 

6.54. (i) f-Hx) = (x + 7)/3, (ii) /-!(«) = V^^^^. 

6.59. T{a, b) = {-a + 26, -3a + 6, 7a - 6). 

6.60. T{a, b, c) = 8o - 36 - 2c. 

6.61. F(v) + F{-v) = F(v + (-■ u)) = F(0) = 0; hence F(-v) - -F{v). 

6.63. (i) (a) {(1, 0, 1), (0, 1, -2)}, dim U = 2; (6) {(2, -1, -1)}, dim W = \. 
(ii) (a) {(1, 1)}, dim ?7 = 1; (6) {(1, -1)}, dim T^ = 1. 
(iii) (a) {(1, 0), (0, 1)}, dim t/ = 2; (6) {(1, -1, 1)}, dim W = \. 



CHAP. 6] LINEAR MAPPINGS 149 

6.64. (i) U^ ")' (o i)| I'asisof KerF; dim(KerF) = 2. 
(") {(_2 l)' (I _2)| ^^sisof ImF; dim(ImF) = 2. 

6.65. F(x, y, z) = {x + 4y, 2x + 5y, Sx + 6y). 

6.66. F(x, y,z,w) = (x + y - z, 2x + y - w, 0). 

6.67. The kernel of D is the set of constant polynomials. The image of D is the entire space V. 

6.69. (i) (a) {(1,2,1), (0,1,1)} basis of Im A; dim(ImA) = 2. 

(6) {(4, -2, -5, 0), (1, -3, 0, 5)} basis of KerA; dim(KerA) = 2. 
(ii) (a) ImB = R3; (6) {(-1,2/3,1,1)} basis of KerB; dim(KerB) = l. 

6.71. (F + G)(,x, y, z) = (y + 2z, 2x-y + z), (3F - 2G)(x, y, z) = (3j/ -Az,x + 2y + Zz). 

6.72. (i) (H°F){x,y,z) = {x + z,2y), (H°G)(x,y,z) - {x-y.iz). (il) Not defined. 
(iii) (Ho(F + G)){x, y,z) = {HoF + Ho G)(x, y, z) = {2x-y + z, 2y + 4«). 

6.77. (S + T)(x, y) = (x, x) (ST){x, y) =z {x- y, 0) 
(5S - 3r)(a;, y) = (5a; + 8y, -3x) (TS){x, y) = (0, a; + y) 

SHx, v) = (x + y, 0); note that S^ = S. 

Ti(x^ y) = {-X, -y); note that T^-\-I = Q, hence T is a zero of x'^ + 1. 

6.78. v{T) = 0. 

6.79. (i) T-Hr, s, t) = (14* + 3s + r, 4t + s, t), (ii) T-^r, s, t) = (^r + ^s, t, ^r - |s). 
6.87. There are no linear maps from RS into R* which are onto. 



chapter 7 



Matrices and Linear Operators 

INTRODUCTION 

Suppose {e1, ..., en} is a basis of a vector space V over a field K and, for v ∈ V, suppose
v = a1e1 + a2e2 + ... + anen. Then the coordinate vector of v relative to {ei}, which we write
as a column vector unless otherwise specified or implied, is

[v]_e = (a1, a2, ..., an)^T

Recall that the mapping v |-> [v]_e, determined by the basis {ei}, is an isomorphism from V
onto the space K^n.

In this chapter we show that there is also an isomorphism, determined by the basis 
{ei}, from the algebra A{V) of linear operators on V onto the algebra cA of n-square matrices 
over K. 

A similar result also holds for linear mappings F:V-^U, from one space into another. 

MATRIX REPRESENTATION OF A LINEAR OPERATOR 

Let T be a linear operator on a vector space V over a field K and suppose {e1, ..., en} is
a basis of V. Now T(e1), ..., T(en) are vectors in V and so each is a linear combination of
the elements of the basis {ei}:

T(e1) = a11 e1 + a12 e2 + ... + a1n en

T(e2) = a21 e1 + a22 e2 + ... + a2n en

.......................................

T(en) = an1 e1 + an2 e2 + ... + ann en
The following definition applies.

Definition: The transpose of the above matrix of coefficients, denoted by [T]_e or [T], is
called the matrix representation of T relative to the basis {ei} or simply the
matrix of T in the basis {ei}:

           ( a11  a21  ...  an1 )
[T]_e  =   ( a12  a22  ...  an2 )
           ( ...  ...  ...  ... )
           ( a1n  a2n  ...  ann )

Example 7.1: Let V be the vector space of polynomials in t over R of degree ≤ 3, and let D : V -> V
be the differential operator defined by D(p(t)) = d(p(t))/dt. We compute the matrix
of D in the basis {1, t, t^2, t^3}. We have:

D(1)   = 0    = 0 + 0t + 0t^2 + 0t^3
D(t)   = 1    = 1 + 0t + 0t^2 + 0t^3
D(t^2) = 2t   = 0 + 2t + 0t^2 + 0t^3
D(t^3) = 3t^2 = 0 + 0t + 3t^2 + 0t^3

150 



CHAP. 7] 



MATRICES AND LINEAR OPERATORS 



151 



Accordingly,

        ( 0  1  0  0 )
[D]  =  ( 0  0  2  0 )
        ( 0  0  0  3 )
        ( 0  0  0  0 )

Example 7.2: Let T be the linear operator on R^2 defined by T(x, y) = (4x - 2y, 2x + y). We com-
pute the matrix of T in the basis {f1 = (1, 1), f2 = (-1, 0)}. We have

T(f1) = T(1, 1) = (2, 3) = 3(1, 1) + (-1, 0) = 3f1 + f2

T(f2) = T(-1, 0) = (-4, -2) = -2(1, 1) + 2(-1, 0) = -2f1 + 2f2

Accordingly,

[T]_f  =  ( 3  -2 )
          ( 1   2 )



Remark: Recall that any n-square matrix A over K defines a linear operator on K" by 
the map v t^ Av (where v is written as a column vector). We show (Problem 
7.7) that the matrix representation of this operator is precisely the matrix A 
if we use the usual basis of K". 

Our first theorem tells us that the "action" of an operator T on a vector v is preserved 
by its matrix representation: 

Theorem 7.1: Let (ei, . . ., e„} be a basis of V and let T be any operator on V. Then, for 
any vector vGV, [T]e [v]e = [Tiv)]e. 

That is, if we multiply the coordinate vector of v by the matrix representation of T, 
then we obtain the coordinate vector of T{v). 

Example 7.3: Consider the differential operator D:V -^V in Example 7.1. Let 

p{t) = a+bt + cfi + dt^ and so D{p{t)) = b + 2ct + 3dt^ 
Hence, relative to the basis {1, t, t^, t^, 



[p(t)] = 



and [D(p{t))] = 



We show that Theorem 7.1 does hold here: 

'0 1 o\ 

2 
3,, 
iO o/\ 



[D][Pit)] = 



= [D(pm 



Example 7.4: Consider the linear operator T : R2 ^ R2 in Example 7.2: T{x, y) = (4a; — 2y, 2x + j 
Let V = (5, 7). Then 

V = (5,7) = 7(1, 1) + 2(-l, 0) = 7/1 + 2/2 

T{v) = (6, 17) = 17(1, 1) + 11(-1, 0) = 17/i + 11/2 

where /j = (1, 1) and fz = (-1, 0). Hence, relative to the basis {/i, /a), 



and [T(v)]f = 



11 



Using the matrix [T]f in Example 7.2, we verify that Theorem 7.1 holds here: 

<2y 



m,M, ^ (r^G) = (ID = i^<* 



152 MATRICES AND LINEAR OPERATORS [CHAP. 7 

Now we have associated a matrix [T]e to each T in A{V), the algebra of linear operators 
on V. By our first theorem the action of an individual operator T is preserved by this 
representation. The next two theorems tell us that the three basic operations with these 

operators 

(i) addition, (ii) scalar multiplication, (iii) composition 

are also preserved. 

Theorem 7.2: Let {ei, ...,e„} be a basis of V over K, and let oA be the algebra of 
«-square matrices over K. Then the mapping T h* [T]e is a vector space 
isomorphism from A{V) onto cA. That is, the mapping is one-one and onto 
and, for any S,T G A{V) and any keK, 

[T + S]e = [T]e+[S]e and [kT]e = k[T]e 

Theorem 7.3: For any operators S,T G A{V), [ST]e = [S]e [T]e. 

We illustrate the above theorems in the case dim V = 2. Suppose {ei, ez} is a basis of 
V, and T and S are operators on V for which 

T{ei) = aiei + a^ez S{ei) — CiCi + 0262 

7(62) = biei + 6262 ' 8(62) = diCi + did 

[^i-Cy - i^i- = (:;*; 

Now we have {T + S){ei) - T{ei) + S(ei) - aiCi + 0262 + ciCi + 6262 

= (tti + Ci)ei + (a2 + 62)62 

(T + S){e2) = r(e2) + 5(62) = bid + 6262 + did + d^ez 

= (&i + di)ei + (62 + ^2)62 

'^^"® 'tti + ci &i + di\ /«! &i\ , /ci di 



^ J l^ttz + ca bz + dzj [az bzj \cz dzj 

Also, for k EK, we have 

(A;r)(ei) = fcr(ei) = k{aiei + azBz) = kaiCi + ka^ez 
{kT){ez) = kTiez) = k(biei + bzez) = kbiei + kbzez 

fkai kbi\ , /ai bi\ , .-m. 

Finally, we have 

(Sr)(ei) == S(r(ei)) = S(aiei + a2e2) = aiS(ei) + a2S(e2) 

= ai{ciei + CzCz) + azidiBi + dzCz) 
= (ttiCi + a2di)ei + (aiCz + azdz)ez 

(Sr)(e2) = S{T{ez)) = S(biei + bzCz) = biS{ei) + b^Siez) 
= bi{ciei + CzCz) + bzidiBi + dzBz) 
= {biCi + bzdi)ei + {biCz + bzdz)ez 

Accordingly, 

_ /aiCi + azdi biCi + bzdi\ _ /ci dA /ai bi\ _ , ._ 
^ J' ~ [aicz + azdz biCz + bzdz) ~ [cz dz) \az bzj L>IJ« 



CHAP. 7] MATRICES AND LINEAR OPERATORS 153 

CHANGE OF BASIS 

We have shown that we can represent vectors by n-tuples (column vectors) and linear 
operators by matrices once we have selected a basis. We ask the following natural question: 
How does our representation change if we select another basis? In order to answer this 
question, we first need a definition. 

Definition: Let [ei, . . .,e„} be a basis of V and let {/i, ...,/«} be another basis. Suppose 

/i = anei + ai2C2 + • • • + ai„e„ 
/2 = aziBi + 022^2 + • • • + a2T.e„ 



fn = ttnlCi + a„2e2 + • • • + UnnCn 



Then the transpose P of the above matrix of coeflScients is termed the transi- 
tion matrix from the "old" basis {d} to the "new" basis {/{}: 

fflll ft21 . . . ftnl ' 
p _ I *12 ^'■22 . . . ffin2 

We comment that since the vectors fi, . . .,fn are linearly independent, the matrix P is 
invertible (Problem 5.47). In fact, its inverse P^Ms the transition matrix from the basis 
{/,} back to the basis {Ci}. 

Example 7.5: Consider the following two bases of R^: 

{ei = (1, 0), 62 = (0, 1)} and ih = (1, D, A = (-1. 0)} 
Then A = (1,1) = (1,0) + (0,1) = e^ + e^ 

/2 = (-1,0) = -(1,0) + 0(0,1) = -ei + 0e2 
Hence the transition matrix P from the basis {ej} to the basis {/J is 

'1 -V 

We also have e^ = (1, 0) = 0(1, 1) - (-1, 0) = O/i - /g 

62 = (0,1) = (1,1) + (-1,0) = /1+/2 

Hence the transition matrix Q from the basis {/J back to the basis {e^} is 

IN 



Q = , 
Observe that P and Q are inverses: 

We now show how coordinate vectors are affected by a change of basis. 

Theorem 7.4: Let P be the transition matrix from a basis {Ci} to a basis {fi} in a vector 
space V. Then, for any vector v G V, P[v]f = [v]e. Hence [v]f = P~'^[v]e. 

We emphasize that even though P is called the transition matrix from the old basis 
{Ci} to the new basis {/i}, its effect is to transform the coordinates of a vector in the new 
basis {fi} back to the coordinates in the old basis {ei}. 



154 MATRICES AND LINEAR OPERATORS [CHAP. 7 

We illustrate the above theorem in the case dim F = 3. Suppose P is the transition 
matrix from a basis {61,62,63} of F to a basis {fufzifa} of V; say, 

A = ttiCi + 0262 + 0363 1 0.1 bi ci\ 

/2 = biBi + 6262 + 6363 . Hence P = 02 &2 C2 

fa = C161 + 6262 + 6363 \aa ba Caj 

Now suppose V G F and, say, v = fei/i + /i;2/2 + fcs/s. Then, substituting for the /i from 
above, we obtain 

V = /i;i(aiei + a262 + a363) + fc2(biei + 6262 + 6363) + kaiciei + 0262 + 6363) 

= (aifci + bife + Cika)ei. + (azki + bzki + C2ka)e2 + {aaki + bakz + 63^3)63 

Thus /jcA jaiki + bife + Cifc3^ 

[v], = Ikz] and [v]e - a^ki + 62^2 + dka 

\lc3l \a3ki + b3k2 + cakal 

Accordingly, / ^ ^^ ^,^\ jaikr + bik2 + cM 

P[v]f - I a2 &2 62^2 = ttzifci + 62^2 + 62^3 = [V]e 

\a3 ba Caj\kaj \a3k1 + bakz + Cakaj 

Also, multiplying the above equation by P~S we have 

P-'[v]e = P-'P[v]f = I[V], = [V], 

Example 7.6: Let v - (a, b) e R2. Then, for the bases of R* in the preceding example, 

V = (a, b) = a(l, 0) + 6(0, 1) = ae^ + be^ 

V ^ (a, 6) = b{l,l) + (b-a)(-l,0) = bfi + ib-a)/^ 

Hence [v]^ =- (J^j and Mf = (ft _ „) 

By the preceding example, the transition matrix P from {ej to {/J and its inverse 
p-i are given by 

I'D - '■- = (-: : 

We verify the result of Theorem 7.4: 



-M. = (_: DC) = (.!.) = .. 



The next theorem shows how matrix representations of linear operators are affected 
by a change of basis. 

Theorem 7.5: Let P be the transition matrix from a basis {Ci} to a basis {/i} in a vector 
space V. Then for any linear operator T on F, [T]t = P-i[T]eP. 

Example 7.7: Let T be the linear operator on E^ defined by T(x, y) = (4a; - 2j/, 2a; + j/). Then for 
the bases of R^ in Example 7.5, we have 

r(ei) = r(l, 0) := (4, 2) = 4(1, 0) + 2(0, 1) = 4ei + 2e2 
ne^) = r(0,l) = (-2,1) = -2(1,0) + (0,1) = -2ei + e2 



Accordingly, [T]e 



/4 -2 

V2 1 



CHAP. 7] MATRICES AND LINEAR OPERATORS 155 

We compute [T]f using Theorem 7.5: 

m, - p-i... = (_: ixi -ixi -I) - (i -I 

Note that this agrees with the derivation of [T]f in Example 7.2. 

Remark: Suppose P - (Oij) is any «-square invertible matrix over a field K. Now if 
{ei, .... en} is a basis of a vector space V over K, then the n vectors 

/i = aiiCi + 02162 + • • • + a„ie„, i=l, . . .,n 

are linearly independent (Problem 5.47) and so form another basis of V. 
Furthermore, P is the transition matrix from the basis {«{} to the basis {/{}. 
Accordingly, if A is any matrix representation of a linear operator T on V, 
then the matrix B = P~^AP is also a matrix representation of T. 

SIMILARITY 

Suppose A and B are square matrices for vs^hich there exists an invertible matrix P 
such that B = P~^AP. Then B is said to be similar to A or is said to be obtained from A 
by a similarity transformation. We show (Problem 7.22) that similarity of matrices is an 
equivalence relation. Thus by Theorem 7.5 and the above remark, we have the following 
basic result. 

Theorem 7.6: Two matrices A and B represent the same linear operator T if and only if 
they are similar to each other. 

That is, all the matrix representations of the linear operator T form an equivalence 
class of similar matrices. 

A linear operator T is said to be diagonalizable if for some basis (Ci} it is represented 
by a diagonal matrix; the basis {«{} is then said to diagonalize T. The preceding theorem 
gives us the following result. 

Theorem 7.7: Let A be a matrix representation of a linear operator T. Then T is 
diagonalizable if and only if there exists an invertible matrix P such that 
P~^AP is a diagonal matrix. 

That is, T is diagonalizable if and only if its matrix representation can be diagonalized 
by a similarity transformation. 

We emphasize that not every operator is diagonalizable. However, we will show 
(Chapter 10) that every operator T can be represented by certain "standard" matrices 
called its normal or canonical forms. We comment now that that discussion will require 
some theory of fields, polynomials and determinants. 

Now suppose / is a function on square matrices which assigns the same value to similar 
matrices; that is, f{A) = f{B) whenever A is similar to B. Then / induces a function, also 
denoted by /, on linear operators T in the following natural way: f{T) = f{[T]e), where {d} 
is any basis. The function is well-defined by the preceding theorem. 

The determinant is perhaps the most important example of the above type of functions. 
Another important example follows. 

Example 7.8: The trace of a square matrix A = (oy), written tr (A), is defined to be the sum of 
its diagonal elements: 

tr (A) = an + 022 + • • • + a„„ 

We show (Problem 7.22) that similar matrices have the same trace. Thus we can 
speak of the trace of a linear operator T; it is the trace of any one of its matrix 
representations: tr {T) = tr ([T]g). 



156 MATRICES AND LINEAR OPERATORS [CHAP. 7 

MATRICES AND LINEAR MAPPINGS 

We now consider the general case of linear mappings from one space into another. 
Let V and U be vector spaces over the same field K and, say, dim V = m and dim U = n. 
Furthermore, let {ei, . . . , em} and {/i, ...,/«} be arbitrary but fixed bases of V and U 
respectively. 

Suppose F:V^U is a linear mapping. Then the vectors F{ei), . .., F{em) belong to 
U and so each is a linear combination of the fc 

F{ei) = aii/i + ai2/2 + • • • + ai„fn 

F{e2) = ttzi/i + 022/2 + • • • + aznfn 



F{em) — ttml/l + dmifl + ■ ' • + Ctmn/n 

The transpose of the above matrix of coefficients, denoted by [F]l is called the matrix 
representation of F relative to the bases {ei} and {ft}, or the matrix of F in the bases {ec} 
and {/i}: 

/ ftll ft21 . . . ami 

rmf _ ^^12 CI22 . . • ttm2 
L^Je - 

\ din 0/2n . . • dmn 

The following theorems apply. 

Theorem 7.8: For any vector v GV, [F]l [v]e = [F{v)],. 

That is, multiplying the coordinate vector of v in the basis (ei} by the matrix [F]l, we 
obtain the coordinate vector of F{v) in the basis {fi). 

Theorem 7.9: The mapping F ^ [F]f is an isomorphism from Hem (V, U) onto the vector 
space of % X m matrices over K. That is, the mapping is one-one and onto 
and, for any F, G G Horn {V, U) and any fc e K, 

[F + G]f = [F]i + [G]/ and [kF]f = k[F]i 

Remark: Recall that any nxm matrix A over K has been identified with the linear map- 
ping from K'" into Z" given by v M' Av. Now suppose V and U are vector 
spaces over K of dimensions m and n respectively, and suppose {e;} is a basis 
of V and {fi} is a basis of U. Then in view of the preceding theorem, we shall 
also identify A with the linear mapping F:V^U given by [F{v)]f = A[v]e. We 
comment that if other bases of V and U are given, then A is identified with 
another linear mapping from V into U. 

Theorem 7.10: Let {ei}, {fi} and {Qi} be bases of V, U and W respectively. Let F:V-*U 
and G:U-*W be linear mappings. Then 

[GoFYe = [GYfWVe 

That is, relative to the appropriate bases, the matrix representation of the composition 
of two linear mappings is equal to the product of the matrix representations of the 
individual mappings. 

We lastly show how the matrix representation of a linear mapping F:V-*U is affected 
when new bases are selected. 

Theorem 7.11: Let P be the transition matrix from a basis {ei} to a basis (e,'} in V, and let 
Qbe the transition matrix from a basis {/i} to a basis {//} in [/. Then for 

any linear mapping F:V ^ U, 

[Ft = Q-'inp 



CHAP. 7] 



MATRICES AND LINEAR OPERATORS 



157 



Thus in particular, 

i.e. when the change of basis only takes place in JJ; and 

[F]l. = [F]iP 
i.e. when the change of basis only takes place in V. 

Note that Theorems 7.1, 7.2, 7.3 and 7.5 are special cases of Theorems 7.8, 7.9, 7.10 
and 7.11 respectively. 

The next theorem shows that every linear mapping from one space into another can be 
represented by a very simple matrix. 

Theorem 7.12: Let F:V-*U be linear and, say, rankF = r. Then there exist bases of 
V and of V such that the matrix representation of F has the form 



A = 



I 




where / is the r-square identity matrix. We call A the normal or canonical 
form of F. 

WARNING 

As noted previously, some texts write the operator symbol T to the right of the vector 
V on which it acts, that is, 

vT instead of T{v) 

In such texts, vectors and operators are represented by n-tuples and matrices which are the 
transposes of those appearing here. That is, if 



then they write 



felCl + feez + • • • + knCn 



[v]e = (A;i, fe, . . ., kn) instead of [v]e = 



And if 



then they write 

[T]e = 



'tti Oi 

bi b2 



lCi C2 



r(ei) = aiei + aid + • • • + a„en 
T{e2) = 6iei + 6262 + • • • + &ne„ 

r(e„) = ciei + 0262 + • • • + c„e„ 



instead of [T]e = 




This is also true for the transition matrix from one basis to another and for matrix rep- 
resentations of linear mappings F:V ^ U. We comment that such texts have theorems 
which are analogous to the ones appearing here. 



158 MATRICES AND LINEAR OPERATORS [CHAP. 7 

Solved Problems 

MATRIX REPRESENTATIONS OF LINEAR OPERATORS 

7.1. Find the matrix representation of each of the following operators T on R^ relative to 
the usual basis {ei = (1, 0), 62 = (0, 1)}: 

(i) T{x, y) = {2y, Sx - y), (ii) T{x, y) = (3x -4y,x + 5y). 

Note first that if (a, b) S R2, then (a, b) = ae^ + be^. 

(i) r(ei) = r(l,0) = (0,3) = 0ei + 3e2 ^ .^, /O 2 

and rri. = ( „ 

T{e^) = T{Q,1) = (2,-1) = 2ei- e^ \S -1 

(ii) r(ei) = r(l,0) = (3,1) = 3ei+ 62 /3 -4 

and [rig = ( 

Tie^) = r(0,l) = (-4,5) = -461 + 562 \1 5 



7.2. Find the matrix representation of each operator T in the preceding problem relative 
to the basis {A = (1,3), /a = (2, 5)}. 

We must first find the coordinates of an arbitrary vector (a, b) G K^ with respect to the basis 

{/J. We have 

(a, b) = x(l, 3) + 2/(2, 5) = (x + 2y, Zx + 5y) 

or X + 2y = a and Sx + 5y = b 

or a; = 26 — 5a and y ~ 3a — 6 

Thus (a, 6) = (26 - 5a)/i + (3a - 6)/2 

(i) We have T{x, y) = (2y, Sx - y). Hence 

r(/i) = r(l,3) = (6,0) = -3O/1 + I8/2 

r(/2) = r(2, 5) = (10, 1) = -48/i + 29/2 

(ii) We have T(x, y) = (3x — 4y,x + 5y). Hence 

r(/i) = r(l,3) = (-9,16) = llfi-ASf^ 
T(h) = r(2,5) = (-14,27) = 124/1-69/2 



and [T]f = 



and [T]f 



30 


-48 


18 


29 


77 


124 


-43 


-69 



ai 


a2 


as 


61 


&2 


bs 


Ci 


C2 


Cs 



7.3. Suppose that T is the linear operator on R^ defined by 

T{x, y, z) = (ttiic + a2.y + aaz, bix + h^y + bsz, cix + dy + Cs^) 

Show that the matrix of T in the usual basis (ei} is given by 

[T]e - 

That is, the rows of [T]e are obtained from the coefficients of x, y and z in the com- 
ponents of T{x, y, z). 

T{ei) = T{1, 0, 0) = (ai, 61, Ci) = a^ei + b^ez + c^e^ 
7(62) = T(0, 1, 0) = (02, 62, C2) = 0361 + 6262 + 6363 
7(63) = r(0, 0, 1) = (aa, 63, C3) = agei + 6363 + 6363 

Accordingly, /ai 03 aaX 

lT]e = (h 62 63 1 

Remark: This property holds for any space K'^ but only relative to the usual basis 

{ei = (l, 0, ...,0), 62 = (0,1,0, ...,0), ..., e„ = (0, ...,0,1)} 




CHAP. 7] MATRICES AND LINEAR OPERATORS 159 

7.4. Find the matrix representation of each of the following linear operators T on R^ 
relative to the usual basis (ei = (1, 0, 0), 62 = (0, 1, 0), 63 = (0, 0, 1)}: 

(i) T{x,y,z) - {2x-Zy + Az,5x-y + 2z,Ax + ly), 

(ii) T{x,y,z) = {2y + z,x-4:y,Zx). 

By Problem 7.3: (i) [T]^ = | 5 -1 2 ) , (ii) [T]^ = \ 1 -4 

7.5. Let T be the linear operator on R^ defined by T(x, y, z) = {2y + z,x — Ay, 3a;). 

(i) Find the matrix of T in the basis {/i = (1, 1, 1), /a = (1, 1, 0), fa = (1, 0, 0)} 

(ii) Verify that [T], [v]f = [T{v)]s for any vector v G R^. 

We must first find the coordinates of an arbitrary vector (a, h, c) G R^ with respect to the basis 
{fvfz'fai- Write {a,b,c) as a linear combination of the /j using unknown scalars x, y and z: 

(a, b, c) = x{l, 1, 1) + y{l, 1, 0) + z(l, 0, 0) 
= (x + y + z,x + y,x) 
Set corresponding components equal to each other to obtain the system of equations 

X + y + z = a, X + y = b, x = c 
Solve the system for x, y and z in terms of a, b and c to find x = c, y = b ~e, z = a — b. Thus 

(a, 6, c) = c/i + (6 - c)/2 + (a - 6)/3 

(i) Since T(x,y,z) = (2j/ + z, a; - 4j/, 3a;) 

r(/i) = r(l,l,l) = (3,-3,3) = 3/1-6/2 + 6/3 

r(/2) = r(l,l,0) = (2,-3,3) = 3/1-6/2 + 5/3 and [T]f 

T{fa) = r(l,0,0) = (0,1,3) = 3/1-2/2- /a 

(ii) Suppose v ~ (a, b, c); then 

V = (a,b,c) = c/i + (6 — c)/2 + (a — 6)/3 and so [v] 



Also, 

T{v) = T(a, b, c) = (26 + c, a - 46, 3(i) 

= 3a/i + (-2a - 46)/2 + (-0 + 66 + c)/3 
Thus 



and so [T{v)]f 






[T]f[v]f = -6 -6 -2 6-c = -2a-45 = lT(v)]f 



7.6. Let A - ( j and let T be the linear operator on R^ defined by T{v) = Av (where 

V is written as a column vector). Find the matrix of T in each of the following bases: 
(i) {ei = (1, 0), 62 = (0, 1)}, i.e. the usual basis; 
(ii) {/i = (l,3), /2 = (2,5)}. 

(i) Tie,) =(l 2yi\ ^ (I) = u, + 3e, 



\3 4/\0/ ^3/ - - /I 2 

^(^^)={3 4)(l) =(:) =2ex + 4e2 



160 



MATRICES AND LINEAR OPERATORS 



[CHAP. 7 



Observe that the matrix of T in the usual basis is precisely the original matrix A which 
defined T. This is not unusual. In fact, we show in the next problem that this is true for any 
matrix A when using the usual basis. 



(ii) By Problem 7.2, (o, 5) = (26 - 5o)/i + (3o - h)!^. Hence 

^<« = (a X) = © = -'-'■ 



and thus 



[r]/ 



/-5 -8N 

V 6 loy 



7.7. Recall that any «-square matrix A = {an) may be viewed as the linear operator T on 
K" defined by T{v) = Av, where v is written as a column vector. Show that the 
matrix representation of T relative to the usual basis {et} of K" is the matrix A, that 

is, [T]e = A. 

I Oil Ol2 • •• «lJl \/ 1 



T(ei) = Aei = 



T(ei) 



Aea = 




fail 



— OiiBi + 021^2 + 



+ a„ie„ 



<»12 

aji2j 



+ 



+ «n2 



Oil <*12 

r(e„) = Ae„ = I «2i «22 





(That is, r(e,) = Ae, is the ith column of A.) Accordingly, 

me = 



0.11 %2 • • • ''itl \ 
021 *22 • • • <*2n 



, Onl <'^n2 



a. 



nn J 



«ln«l + 02n«2 + ••• + Onnen 



= A 



7.8. Each of the sets (i) {l,t,e\te*) and (ii) {f^\t^*,t^e^*} is a basis of a vector space V 
of functions / : R -» R, Let D be the differential operator on V, that is, D{f) — df/dt. 
Find the matrix of D in the given basis. 



(i) I>(1) = = 0(1) + 0(t) + 0(e«) + O(te') 

I?(t) = 1 = 1(1) + 0(t) + 0(et) + O(te') 

£>(et) = e* = 0(1) + 0(t) + l(eO + 0(<et) 

i)(«e«) = e« + te* = 0(1) + 0(t) + l(e«) + l(tet) 



and 



[D] = 



(ii) 2)(e3«) = 3e»« = 3(e30 + 0(«e3') + 0(t263t) 

D(<e30 = eSt + 3«e8t = l(e30 + 3(«e3t) + 0(t2e3t) 
2)(t2e3t) = 2teS» + Bt^e^* = 0(e3«) + 2(«e3') + 3(t2e3') 



and 



[D] = 




CHAP. 7] MATRICES AND LINEAR OPERATORS 161 

7.9. Prove Theorem 7.1: Suppose {ei, . . .,e„} is a basis of V and T is a linear operator 
on F. Then for any vGV, [T]e [v]e = [T{v)]e. 

Suppose, for i — 1, . . .,n, 

n 

Tiei) = Bjiei + ffljaea + • • • + Oi„e„ = 2 %«} 
Then [r]e is the n-square matrix whose ith row is 

n 

Now suppose V = k^ei + kzBz + • • • + fc„e„ = 2 K^i 

Writing a column vector as the transpose of a row vector, 

[v], = (&i, fcj, ...,fe„)t (2) 

Furthermore, using the linearity of T, 

T{v) = T (^ 2 he^ = 2 hned = 2 fci (^ 2 a«e,- 

n / n \ n 
= 2(2 ««*> ) «j = 2 (oij-fci + a2jfc2 ^ h a„^fe„)ej 

j— 1 \ i=l / i— 1 

Thus [r(i;)]g is the column vector whose jth entry is 

aijfci + a^jk^ + • • ■ + a„j&„ (^) 

On the other hand, the ith entry of [r]e[^]e is obtained by multiplying the ;th row of [T\g by [v]^, 
i.e. (1) by (2). But the product of (1) and (2) is (3); hence [r]c[v]e and [T(v)\g have the same entries. 
Thus [T], [v], = [T(v%. 



7.10. Prove Theorem 7.2: Let {ei, . . . , e„} be a basis of V over X^, and let cA be the algebra 
of %-square matrices over K. Then the mapping T ^ [T]e is a vector space isomor- 
phism from A{V) onto cA. That is, the mapping is one-one and onto and, for any 
S,T& A{V) and any kGK, [T + S\e = {T]e + [S\e and [kT]e = k[T]e. 

The mapping is one-one since, by Theorem 8.1, a linear mapping is completely determined by 
its values on a basis. The mapping is onto since each matrix M & cA is the image of the linear 
operator ^ 

F(e^ - 2 »»«e^ i = l,...,n 

i=l 

where (wy) is the transpose of the matrix M. 

Now suppose, for i = 1, . . . , w, 

n n 

T{eO = 2 cmej and S{ei) = 2 SijCj 
i=i i=i 

Let A and B be the matrices A = (ay) and B = (6y). Then [r]^ = A* and [5]^ = B*. We have, 
for i = 1, ...,%, „ 

(r + SKej) = T(ei) + S{ei) = 2K + 6«)ej 

Observe that A + J? is the matrix (ay + 6y). Accordingly, 

[T + S], = (A+B)t = A' + fit = [r],+ [S]e 
We also have, for i = 1, .. .,n, 

n n 

(fcrXej) = k T(et) = fc 2 ayej = 2 ikaij)ej 

Observe that kA is the matrix (fcay). Accordingly, 

[kT], = (kA)t = kAt = k[T], 
Thus the theorem is proved. 



162 MATRICES AND LINEAR OPERATORS [CHAP. 7 

7.11. Prove Theorem 7.3: Let {ei, . . . , e„} be a basis of V. Then for any linear operators 

S,Te A{V), [ST]e = [S]e [T]e. 

n n 

Suppose r(ej) = 2 "ij^j and S(ej) = 2 6jk«/c. Let A and B be the matrices A = (ay) and 
1=1 fc=l 

B = (bjk). Then [T]^ = A* and [S]^ - B*. We have 

(ST)iei) = S(7'(ei)) = sCSoije,) = 2 a«S(e,) 

\i— 1 / i— 1 

n/« \ n / n \ 

= 2 a« ( 2 6ifc6fc ) = 2(2 aijftjic «k 

i = l \fc = l / IC=1 \3 = 1 / 

n 

Recall that AB is the matrix AB = (cjfc) where Cj^ = 2 "■iibjk- Accordingly, 

J — 1 

[ST], = (AB)t = B*At = [S]AT]e 

CHANGE OF BASIS, SIMILAR MATRICES 

7.12. Consider these bases of R^: {ei = (1,0), cz = (0,1)} and {/i ^ (1,3), /2 = (2,5)}. 
(i) Find the transition matrix P from {ei} to {/i}. (ii) Find the transition matrix Q 
from {/i} to {ei}. (iii) Verify that Q = P'K (iv) Show that [vy = P-^[v]e for any 
vector V eR^ (v) Show that [T]f = P-'[T]eP for the operator T on R^ defined by 
T{x, y) = {2y, Sx - y). (See Problems 7.1 and 7.2.) 

(i) /i = (1,3) = lei + 362 ^^^ p = /^ ^ 



/2 = (2,5) = 261 + 562 \3 5 

(ii) By Problem 7.2, (a, 6) = (26 - 5a)/i + (3a - 6)/2. Thus 

61 = (1,0) = -5/1 + 3/2 ^^^ Q^/-5 2 

62 = (0,1) =2/1-/2 V 3 -1 

(-) ^« = (3 5)(~3 -1) = Co 1) " ' 

/a\ /26-5a\ 

(iv) If i; = (a,6), then Me=(j,) and M/ = ( g^-^)- Hence 



P-'\vl 



-5 2\/a\ _ /-5a +26 
3 -1A6/ ~ I 3a -6 



/O 2\ /-30 -48 \ 

(v) By Problems 7.1 and 7.2; ['ne=(g_^) and [T]f = (^13 29 j" ^""^ 

7.13. Consider the following bases of R«: {ei = (1,0,0), 62 = (0,1,0), 63 = (0,0,1)} and 
{/i = (1,1,1), /2 = (1,1,0), /3 = (1,0,0)}. (i) Find the transition matrix P f rom {ei} 
to {/i}. (ii) Find the transition matrix Q from {A} to {ei}. (iii) Verify that Q = P \ 
(iv) Show that [v]/ = P-^[v]e for any vector v G R^ (v) Show that [T]f = P ^[T]eP 
for the T defined by T{x, y, z) = {2y + z,x- Ay, 3a;). (See Problems 7.4 and 7.5.) 

(i) /l = (1,1,1) = Iei+l62+l63 

/a = (1, 1, 0) = lei + 1^2 + Oeg and P = 

fs = (1,0,0) = lei + 062 + 063 




CHAP. 7] MATRICES AND LINEAR OPERATORS 163 

(ii) By Problem 7.5, (a, b, c) = cf^ + (b - c)/2 + (a - b)fs. Thus 

ei = (1,0,0) = 0/1 + 0/2 + 1/3 /o 1^ 

62 = (0,1,0) = 0/1 + 1/2-1/3 and Q = 1-1 

63 = (0,0,1) = 1/1-1/2 + 0/3 \l -1 0; 

'1 1 iWo 1^ 
(iii) PQ ^ \l 1 1 -1 I = 

,1 0/\l -1 0^ 

(iv) It v = (a, 6, c), then [v], = \b\ and [v]f = U _ ^ ) . Thus 

\a- bi 





/O 2 1\ / 3 3 3\ 

(v) By Problems 7.4(ii) and 7.5, [7]^ = h -4 and [T\f = -6 -6 -2 . Thus 

\3 0/ \ 6 5 -1/ 

/O l\/o 2 l\/l 1 l\ / 3 3 3\ 
P-^[T\eP =0 1-11-4 1 1 = -6 -6 -2 = m, 
\l -1 0/\3 o/\l 0/ \ 6 5 -1/ 

7.14. Prove Theorem 7.4: Let P be the transition matrix from a basis {cj} to a basis {h) 
in a vector space V. Then for any vGV, P[v]f = [v]e. Also, [v]f = P-^[v]e. 

n 

Suppose, for i=l,...,n, A = ajiei + 04262 + • • • + aj„e„ = 2 ^0^. Then P is the «-square 
matrix whose jth row is j=i 

(oij, a2j, .... a„j) (i) 

n 

Also suppose V - kj^ + k2f2+ ■■■ + kj„ = 2 Vj- Then writing a column vector as the 
transpose of a row vector, *-i 

[V]f = (fci, ^2, ...,fc„)t (2) 

Substituting for /j in the equation for v, 

V = 2v, = 2^i(i««e,) = i(|«iA)«i 

n 

= 2 (aijfci + a2jk2 + • • • + a„jkn)ej 

Accordingly, [11]^ is the column vector whose jth entry is 

aijfci + a2jfc2 + • • • + a„jk„ (s) 

On the other hand, the yth entry of Plv]f is obtained by multiplying the ith row of P by [vh, i.e. 
(1) by (2). But the product of (1) and (2) is (5); hence P[v]f and [v]^ have the same entries and thus 

PMf = Me- 

Furthermore, multiplying the above by P-i gives P~^[v]e = P-iP[v]f = [v]f. 

7.15. Prove Theorem 7.5: Let P be the transition matrix from a basis {d} to a basis {/i} in 
a vector space F. Then, for any linear operator T on V, [T]t = P-i [T]eP. 

For any vector vGV, P-HT]^P[v]f = P-^[T],[v], = p-i[T(v)]^ = [T(v)]f. 



164 MATRICES AND LINEAR OPERATORS [CHAP. 7 

But [T]f[v]f = [T{v)]f; hence P-^[T],P[v]f = [T],[v]f. 

Since the mapping v l-» [v]f is onto K», P-i[T]^PX = [T]fX for every X £ iC«. 

Accordingly, P-i[r],P = [7]^. 

7.16. Show that similarity of matrices is an equivalence relation, that is: (i) A is similar 
to A; (ii) if A is similar to B, then B is similar to A; (iii) if A is similar to B and B is 
similar to C then A is similar to C. 

(i) The identity matrix / is invertible and / = /"i. Since A = I-^AI, A is similar to A. 

(11) Since A is similar to B there exists an invertible matrix P such that A = P-^BP. Hence 
B = PAP-i = (P-i)-»AP-i and P^^ is invertible. Thus B is similar to A. 

(iii) Since A is similar to B there exists an invertible matrix P such that A = p-iPP, and since 
B is similar to C there exists an invertible matrix Q such that B = Q-^CQ. Hence A = 
p-iBP = P-^(Q-^CQ)P = (QP)->C(QP) and QP is invertible. Thus A is similar to C. 



TRACE 

7.17. The trace of a square matrix A = (oij), written tr (A), is the sum of its diagonal 

elements: tr (A) = an + ^ + • • • + a„„. Show that (i) tr (AB) = tr (BA), (ii) if A 

is similar to B then tr (A) = tr (B). 

n 

(1) Suppose A = (a„) and B = (fty). Then AP = (ci^) where Cj^ = ^ aij&jfc- Thus 

n n n 

tr(AP) = 2 Cii = 2 2 ttyfeji 
i=l i=l }=1 

n 

On the other hand, BA = (d^^) where dj^ = 2 6ji«Hic- Thus 

i=l 

tr(PA) = 2 dji = 2 2 6ii«« = 2 2aa6;i = tr(AP) 

j = l 3=1 i=l >=1 5 = 1 

(ii) If A is similar to B, there exists an invertible matrix P such that A = P-^BP. Using (i), 

tr(A) = tr(P-iPP) = tr (PPP-i) = tr (P) 



7.18. Find the trace of the following operator on R^: 

T{x, y, z) = (aiflj + a2y + a^z, bix + h^y + hsz, Cix + Czy + csz) 

We first must find a matrix representation of T. Choosing the usual basis {ej, 

h 62 &3 

Cl C2 C3 / 

and tr (T) = tr ([T],) = di + 63 + C3- 

7.19. Let V be the space of 2 x 2 matrices over R, and let Af = f g ^ j . Let T be the linear 
operator on V defined by T{A) = MA. Find the trace of T. 

We must first find a matrix representation of T. Choose the usual basis of V: 



CHAP. 7] MATRICES AND LINEAR OPERATORS 165 



Then 



nS,) = ME, = (^l ^)(J °) = (^^ °) = IE, + OE, + 3^3 + 0^4 
nE,) = ME, = ^g ^^(^J J^ = ^J J) = 0^1 + IE, + OE, + SE, 
T(E,) = ME, = (I ^Y° I) ^ (^ I) ::. 2E, + OE, + 4E, + OE, 



3 4/Vl 0/ V4 



/I 2\/0 ON /O 2\ 

T(E,) = ME^ = (^3 J(^^ J = (^^ J = OE, + 2E, + OE, + 4E^ 



Hence /l 2 0\ 

10 2 



[T]e = 
and tr (T) = 1 + 1 + 4 + 4 = 10. 



3 4 
\0 3 4^ 



MATRIX REPRESENTATIONS OF LINEAR MAPPINGS 

7:20. Let if : R3 ^ R2 be the linear mapping defined by F{x, y, z) = {Sx + 2y-4z,x-5y + Sz). 

(i) Find the matrix of F in the following bases of R* and R^: 

{ft = (1, 1, 1), h = (1, 1, 0), fs = (1, 0, 0)}, {9i = (1, 3), g, = (2, 5)} 

(ii) Verify that the action of F is preserved by its matrix representation; that is, for 
any vGR^ [F]nv]f = [F{v)], . 

(i) By Problem 7.2, (a, b) = (26 - 5a)ffi + (3a - b)g2. Hence 
F(/i) - F(l,l,l) = (1,-1) = -7g,+ 4g, 

7 QQ 1 Q \ 

4 19 8/ 
F(fa) = F(1,0,0) = (3,1) = -135^1+ »ff2 

(ii) If v-(x,y,z) then, by Problem 7.5, v - zf, + {y - z)/, + {x - y)/,. Also, 

F(v) = (Sx + 2y-4z,x-5y + 3z) = (-13a; - 20y + 26z)gi + (8a; + lly - 15z)g2, 





, r„, , /-13a; - 2O3/ + 26z \ 

and [i^(-)]a - ( 8, + 11/- 15. )• T^^ 



7 -33 -13 \/ ^ /-13a; - 20j/ + 26«\ ,„, ^, 

4 19 SJU-H == (8a; + ll.-15. ) = t^(^>^' 



7.21. Let F:R^-^K^ be the linear mapping defined by 

F{Xi, X2, . . . , Xn) = {anXi + • • • + amXn, 021X1 + • • • + aanXn, . . . , OmlXi + • • • + amnXn) 

Show that the matrix representation of F relative to the usual bases of K" and of K" 
is given by 

(ttu ai2 ... ttln ' 

0,21 CI22 ... (l2n 

flml (tm2 • . . dmnl 



166 MATRICES AND LINEAR OPERATORS [CHAP. 7 

That is, the rows of [F] are obtained from the coefficients of the Xi in the components 
of F{xi, . . ., x„), respectively. 



^(1,0 0) = (ail, aai, ...,(i„i) /«n «12 •■• «in 

F{0, 1, . . . , 0) = (ai2, a22' ••■> "m2) , rpi - I "21 «22 • • • «a 

F{0,0, . . ., 1) = (ai„, (l2r» • • •> "rnn) y^ml «m2 • • • «tr 




7.22. Find the matrix representation of each of the following linear mappings relative to 
the usual bases of R": 

(i) F : R2 ^ R3 defined by F{x, y) = {Zx -y,2x + 4y, 5x - ey) 

(ii) F : R* ^ R2 defined by F{x, y, s, t) = (3a; -4:y + 2s-U,hx + ly-s- 2t) 

(iii) F : R3 ^ R* defined by F{x, y, z) = {2x + Zy-%z,x + y + z. Ax - 5z, &y) 

By Problem 7.21, we need only look at the coefficients of the unknowns in F{x, y, . . .). Thus 

(2 3 g\ 
6 0/ 

7.23. Let T:R2^R2 be defined by T{x,y) = (2x-Zy,x + Ay). Find the matrix of T in 
the bases {ei = (1, 0), 62 = (0, 1)} and {A = (1,3), ^ = (2,5)} of R^ respectively. 
(We can view T as a linear mapping from one space into another, each having its 
own basis.) 

By Problem 7.2, {a,b) = (26-5o)/i + (3a-6)/2. Then 

r(ei) = r(l,0) = (2,1) = -8/1+ 5/2 ^^^ f ^ /-8 23 

Tie^) = r(0,l) = (-3,4) ^ 23/1-13/2 ^'^ ' \ 5 "13 

/ 2 5 —3 \ 

7.24. Let A = ( . Recall that A determines a linear mapping F-.W^B? de- 

\1 -4 7/ 

fined by F(v) = Av where v is written as a column vector. 

(i) Show that the matrix representation of F relative to the usual basis of R^ and 

of W is the matrix A itself: [F] = A. 
(ii) Find the matrix representation of F relative to the following bases of R^ and R*. 

{/i = (1, 1, 1), U = (1, 1, 0), h = (1, 0, 0)}, {^1 = (1, 3), g^ = (2, 5)} 



(i) 



F(1,0,0) = (1 _4 ^) = (1) = 261 + 162 



F(0,1,0) = (j _J ^) 1 = (_J) - 561-462 



/ 2 5 3 \ 

from which W\ = { -, _. „ ) = -A- (Compare with Problem 7.7.) 



(ii) By Problem 7.2, (a, 6) = (26 - 5a)flri + (Za-V^g^. Then 



CHAP. 7] 



MATRICES AND LINEAR OPERATORS 



167 



and [F]« 



F(h) = 


(I 


5 

-4 


-?)(; 


F{f2) = 


il 


5 
-4 




F(h) = 


il 


5 
-4 




-12 -41 


~^^ 






8 24 


5 ■ 







^( 



:) - 


-- -12flri + 8^2 


4) = 


= -41flri + 24fir2 


I) - 


'- -SfiTi + 5fr2 



7.25. Prove Theorem 7.12: Let F:y-*J] be linear. Then there exists a basis of Y and a 

basis of C7 such that the matrix representation A of i^ has the form A = ( 
where / is the r-square identity matrix and r is the rank ofF. \ 

Suppose dim V - m, and dim JJ = n. Let W be the kernel of F and V the image of F. We 
are given that rank F = r\ hence the dimension of the kernel of F is m — r. Let {wi, . . .,«)„_ J 
be a basis of the kernel of F and extend this to a basis of V: 

Set Ml = F('Ui), M2 = ^(va), ..., Mr = F(t)^) 

We note that {mi, . . . , m J is a basis of J7', the image of F. Extend this to a basis 

{%, . . ., M„ Mr+l, . . ., M„} 



of J7. Observe that 

F(t;i) = Ml = 1mi + 0^2 + 
^(va) = M2 — 0*<i + 1^2 + 



+ Om, + Om^+i + • 
+ Om^ + Om^+i + • 



F(i;,) = M, = Omi + 0^2 + 

F(Wi) =0 = Omi + OM2 + 



+ 1m^ + OMy+ 1 + 



F(w^_r) = = 0% + OM2 + • • • + OWr + Om^+i + 

Thus the matrix of F in the above bases has the required form. 



+ 0m„ 
+ 0m„ 



+ 0m„ 
+ Om„ 



+ 0m„ 



Supplementary Problems 



MATRIX REPRESENTATIONS OF LINEAR OPERATORS 

7.26. Find the matrix of each of the following linear operators T on R2 with respect to the usual basis 
{ei = (1, 0), 62 = (0, 1)}: (i) r(», y) = (2x -3y,x + y), (ii) T{x, y) = (5x + y,Zx- 2y). 

7.27. Find the matrix of each operator T in the preceding problem with respect to the basis {/i = (1, 2), 
/2 = (2, 3)}. In each case, verify that {T\f{v\f - {T(v)]f for any v e R2. 



7.28. Find the matrix of each operator T in Problem 7.26 in the basis {g^ = (1, 3), g^ = (1, 4)}. 



168 MATRICES AND LINEAR OPERATORS [CHAP. 7 

7.29. Find the matrix representation of each of the following linear operators T on R3 relative to the 
usual basis: 

(i) T(x,y,z) = (x,y,0) 

(ii) T{x, y, z) = (2x -7y - Az, 3x + y + 4z, 6a; - 83/ + z) 

(iii) T(,x, y,z) = (z,y + z, x + y + z) 

7.30. Let D be the differential operator, i.e. D(f) — dfldt. Each of the following sets is a basis of a 
vector space V of functions / : E ^ R. Find the matrix of D in each basis: (i) {e', e^t, te^'^}, 
(ii) {sin t, cos t}, (iii) {e5«, te^*, t^e^t}, (iv) {1, t, sin St, cos 3t}. 

7.31. Consider the complex field C as a vector space over the real field E. Let T be the conjugation 
operator on C, i.e. T(z) = z. Find the matrix of T in each basis: (i) {1, i), (ii) {1 + i, 1 + 2i}. 

7.32. Let V be the vector space of 2 X 2 matrices over R and let Af = ( ) . Find the matrix of each 



of the following linear operators T on V in the usual basis (see Problem 7.19) of V: (i) T{A) = MA, 
(ii) T{A) = AM, (iii) T(A) =^MA- AM. 

7.33. Let ly and Oy denote the identity and zero operators, respectively, on a vector space V. Show that, 
for any basis {ej of V, (i) [1^]^ = I, the identity matrix, (ii) [Oy]^ = 0, the zero matrix. 

CHANGE OF BASIS, SIMILAR MATRICES 

7.34. Consider the following bases of R^: {e^ - (1, 0), eg = (0, 1)} and {/i = (1, 2), ^ = (2, 3)}. 

(i) Find the transition matrices P and Q from {gj} to {/J and from {/j} to {ej, respectively. 
Verify Q = P-i. 

(ii) Show that [v]^ = P[v]f for any vector v G fP. 

(iii) Show that [T]f - P-'^[T]^P for each operator T in Problem 7.26. 

7.35. Repeat Problem 7.34 for the bases {/i = (1,2), /a = (2,3)} and {g^ = (1,3), g^ = (1.4)}. 

7.36. Suppose {e^, e^} is a basis of V and T :V -^V is the linear operator for which T^e^) = Se^ — 2e2 
and T{e2) = ej + 4e2. Suppose {/i, /a} is the basis of V for which /i = ei + e^ and /z = 2ei + 3e2. 
Find the matrix of T in the basis {/i, /j}. 

7.37. Consider the bases B — {1, i} and B' = {1 + i, 1 + 2i} of the complex field C over the real field 
R. (i) Find the transition matrices P and Q from B to B' and from B' to B, respectively. Verify 
that Q = P-\ (ii) Show that [T]^, = P-'^[T]bP for the conjugation operator T in Problem 7.31. 

7.38. Suppose {ej, {/J and {flrj} are bases of V, and that P and Q are the transition matrices from {ej 
to {/j} and from {/J to {ffj, respectively. Show that PQ is the transition matrix from {ej to {fTj}. 

7.39. Let A be a 2 by 2 matrix such that only A is similar to itself. Show that A has the form 

A 



Generalize to w X w matrices. 



= c :) 



7.40. Show that all the matrices similar to an invertible matrix are invertible. More generally, show that 
similar matrices have the same rank. 

MATRIX REPRESENTATIONS OF LINEAR MAPPINGS 

7.41. Find the matrix representation of the linear mappings relative to the usual bases for R": 
(i) F : R3 -* R2 defined by F{x, y, z) = (2x - 4j/ + 9s, 5x + Sy- 2z) 

(ii) F : Ri! ^ R* defined by F{x, y) = (3a; + 4j/, 5x -2y,x + ly, ix) 
(iii) F:R*-*B, defined by F(x, y, s,t) = 2x + 3y-7s-t 
(iv) i^ : R ^ R2 defined by F(x) = (3x, 5x) 



CHAP. 7] MATRICES AND LINEAR OPERATORS 169 

7.42. Let i<' : R3 ^ R2 be the linear mapping defined by F{x, y, z) = (2x + y — z, Sx — 2y + iz). 
(i) Find the matrix of F in the following bases of RS and R^: 

{/i = (l,l,l), /2 = (1,1,0), /a = (1,0,0)} and {jti = (1, 3), fir^ = (1, 4)} 
(ii) Verify that, for any vector v G R3, [F]^ [v]f = [F{v)]g. 

7.43. Let {ej and {/J be bases of V, and let ly be the identity mapping on V. Show that the matrix of 
ly in the bases {ej and {/;} is the inverse of the transition matrix P from {e^} to {f^}^, that is, 

7.44. Prove Theorem 7.7, page 155. {Hint. See Problem 7.9, page 161.) 

7.45. Prove Theorem 7.8. {Hint. See Problem 7.10.) 

7.46. Prove Theorem 7.9. {Hint. See Problem 7.11.) 

7.47. Prove Theorem 7.10. {Hint. See Problem 7.15.) 

MISCELLANEOUS PROBLEMS 

7.48. Let r be a linear operator on V and let W be a subspace of V invariant under T, that is, 

/A B\ 
T{W) C W. Suppose dim W — m. Show that T has a matrix representation of the form I . ) 

where A is an •m X to submatrix. ^ 

7.49. Let y = 1/ © W, and let U and W each be invariant under a linear operator T : V -> V. Suppose 

/A ON , 
dim U = m and dim V = n. Show that T has a matrix representation of the form ( I where 



B 

A and B are mXm and nX n submatrices, respectively. '' 

7.50. Recall that two linear operators F and G on y are said to be similar if there exists an invertible 
operator T on V such that G = T-^FT. 

(i) Show that linear operators F and G are similar if and only if, for any basis {ej of V, the 
matrix representations [F]^ and [G]g are similar matrices. 

(ii) Show that if an operator F is diagonalizable, then any similar operator G is also diagonalizable. 

7.51. Two mX n matrices A and B over K are said to be equivalent if there exists an m-square invertible 
matrix Q and an n-square invertible matrix P such that B — QAP. 

(i) Show that equivalence of matrices is an equivalence relation. 

(ii) Show that A and B can be matrix representations of the same linear operator F :V -> U if 
and only if A and B are equivalent. 

(iii) Show that every matrix A is equivalent to a matrix of the form ( j where / is the r-square 

identity matrix and r = rank A. V ' 

7.52. Two algebras A and B over a field K are said to be isomorphic (as algebras) if there exists a bijective 
mapping / : A -* B such that for u,v S A and k G K, (i) f{u + v) = f(u) + f(v), (ii) /(few) = fe/(w), 
(iii) f{uv) = f{u)f{v). (That is, / preserves the three operations of an algebra: vector addition, scalar 
multiplication, and vector multiplication.) The mapping / is then called an isomorphism of A onto 
B. Show that the relation of algebra isomorphism is an equivalence relation. 

7.53. Let cA be the algebra of *i-square matrices over K, and let P be an invertible matrix in cA. Show 
that the map A \-^ P~^AP, where A G c/f, is an algebra isomorphism of a4 onto itself. 



170 



MATRICES AND LINEAR OPERATORS 



[CHAP. 7 



Answers to Supplementary Problems 



7.26. (i) 



/2 -3 

\1 1 



(ii) 



6 1 
3 -2 



7.27. Here (a, 6) = (26 - 3a)/i + (2a - b)!^. 



^., ,18 25 \ ^..^ /-23 -39 
« -11 -15 j ^"^ ( 15 26 



7.28. Here (a, 6) = (4a - h)gi + (6 - Za)g2. 



-32 -45 



« (25 35/ (") V-27 -32 



35 41 



7.29. (i) 



10 

10 

,0 0, 



'2 -7 -4I 
3 14 
6-8 1 ( 



(iii) 



7.30. (i) 



1 0' 
2 1 

,0 2, 



(ii) 



(iii) 



5 101 

5 2 

,0 5, 



(iv) 



'0 1 o\ 



0-3 

iO 3 0/ 



7.31. (i) 



1 
-1 



(ii) 



-3 4 
-2 -3 



7.32. (i) 




(ii) 




(iii) 






^c 


h 





-6 


a-d 





6 


c 





d — a 


— c 



7.34. P = 



Q = 



-3 
2 



7.35. P = 



3 5 
-1 -2 



Q = 



2 5 
-1 -3 



7.36. 



8 11 
-2 -1 



7.37. P = 



2 -1 
-1 1 



7.41. (i) 



-4 9 
3 -2 




(iii) (2,3,-7,-1) (iv) 



7.42. (i) 



3 11 
-1 -8 



chapter 8 



Determinants 

INTRODUCTION 

To every square matrix A over a field K there is assigned a specific scalar called the 
determinant of A; it is usually denoted by 

det(A) or |A| 

This determinant function was first discovered in the investigation of systems of linear 
equations. We shall see in the succeeding chapters that the determinant is an indispensable 
tool in investigating and obtaining properties of a linear operator. 

We comment that the definition of the determinant and most of its properties also apply 
in the case where the entries of a matrix come from a ring (see Appendix B). 

We shall begin the chapter with a discussion of permutations, which is necessary for 
the definition of the determinant. 

PERMUTATIONS 

A one-to-one mapping <7 of the set {1,2, . . .,«} onto itself is called a permutation. We 
denote the permutation <r by 

1 2 ... n\ . . . , .... 

. . . or <T — 3i32 ■ ■ ■ 3n, where 3i = cr(i) 

h H . • . JnJ 

Observe that since o- is one-to-one and onto, the sequence /i J2 . . . jn is simply a rearrange- 
ment of the numbers 1, 2, . . . , «. We remark that the number of such permutations is n !, 
and that the set of them is usually denoted by S„. We also remark that if <7 € /S„, then the 
inverse mapping cr"^ G S„; and if a.rGSn, then the composition mapping o-orGSn. In 
particular, the identity mapping 

belongs to Sn. (In fact, e — 12...n.) 

Example 8.1 : There are 2 ! = 2 • 1 = 2 permutations in Sg: 12 and 21. 

Example 8.2: There are 3! = 3'2'1 = 6 permutations in S3: 123, 132, 213, 231, 312, 321. 

Consider an arbitrary permutation a in Sn. a = ji jz . . . jn. We say a is even or odd 
according as to whether there is an even or odd number of pairs (i, k) for which 

i> k but i precedes A; in er (*) 

We then define the sign or parity of a, written sgn <t, by 

r 1 if <7 is even 
sgna = J. 

[—1 if a IS odd 

171 



172 



DETERMINANTS 



[CHAP. 8 



Example 8.3: Consider the permutation a — 35142 in S5. 

3 and 5 precede and are greater than 1; hence (3, 1) and (5, 1) satisfy (*). 

3, 5 and 4 precede and are greater than 2; hence (3, 2), (5, 2) and (4, 2) satisfy (*). 

5 precedes and is greater than 4; hence (5, 4) satisfies (*). 

Since exactly six pairs satisfy (*), a is even and sgn <r = 1. 

Example 8.4: The identity permutation e = 12 . . . n is even since no pair can satisfy (*). 

Example 8.5 : In S2, 12 is even, and 21 is odd. 

In S3, 123, 231 and 312 are even, and 132, 213 and 321 are odd. 

Example 8.6: Let t he the permutation which interchanges two numbers i and ; and leaves the 
other numbers fixed: 

r{i) = j, r(j) = i, T(fc) = k, k^ i, j 

We call T a transposition. If i < j, then 

T = 12 ...{i-l)}{i+l) ... ij-l)i{j+l) ...n 
There are 2{j — i — l) + l pairs satisfying (*): 

(j,i), (h^)! i^t i), where x = i+1, . . .,j—l 
Thus the transposition t is odd. 

DETERMINANT 

Let A — (Oij) be an «-square matrix over a field K: 

jdll (tl2 ... O-ln^ 
A _ I 0-21 0.22 ■ . ■ 0'2n 



\a„i 



CLn2 



Consider a product of n elements of A such that one and only one element comes from each 
row and one and only one element comes from each column. Such a product can be written 
in the form 

ftlil 0.212 ■ ■ ■ ^^'n 

that is, where the factors come from successive rows and so the first subscripts are in the 
natural order 1,2, . . .,n. Now since the factors come from different columns, the sequence 
of second subscripts form a permutation <t = ji 32 . . . jn in Sn. Conversely, each permuta- 
tion in Sn determines a product of the above form. Thus the matrix A contains n\ such 
products. 

Definition: The determinant of the w-square matrix A = (Odi), denoted by det(A) or |A|, 
is the following sum which is summed over all permutations o- = ji jz . . . h 

in Sn. 

\A\ = 2/ (sgn o-)a'UiO,2j^ . . . a„j„ 



That is, 



2 (sgn C7)aicr(l) tt2(T(2) • • • Ona-in) 



The determinant of the w-square matrix A is said to be of order n and is frequently 
denoted by 



ail ai2 

0-21 0.22 
Onl 0,n2 



ttln 
0.2n 



CHAP. 8] 



DETERMINANTS 



173 



We emphasize that a square array of scalars enclosed by straight lines is not a matrix but 
rather the scalar that the determinant assigns to the matrix formed by the array of scalars. 



Example 8.7: 



Example 8.8: 



Example 8.9: 



The determinant of a 1 X 1 matrix A — (an) is the scalar on itself: \A\ = On. 
(We note that the one permutation in Sj is even.) 

In 1S2, the permutation 12 is even and the permutation 21 is odd. Hence 

''■ll'*22 '*12'*21 



Thus 









"H «12 








0-1^ a^i 


4 
-1 


-5 
-2 


= 4(- 


-2) - (-5 



-13 and 



a b 
e d 



= ad — be. 



In 1S3, the permutations 123, 231 and 312 are even, and the permutations 321, 213 and 
132 are odd. Hence 



On ai2 ai3 

0-21 '*22 "•23 
131 *32 <*33 

This may be written as: 

Oll(<l22'*33 ~ ''■23'*32) 



<*ll'*22'*33 + <*i2(l23a3l + di^a^ia^i 

— <Z'X3a220'3l — ttl2<*2l'>33 "" <*ll''23*32 

*12("21<'33 ~ ®23*3l) + '*13('''2l'''32 ~ <*22'*3l) 



a22 «23 


~ «12 


«21 "^23 


+ «13 


021 022 


O32 '''33 




«31 "33 




O3I 032 



which is a linear combination of three determinants of order two whose coefficients 
(with alternating signs) form the first row of the given matrix. Note that each 
2X2 matrix can be obtained by deleting, in the original matrix, the row and column 
containing its coefficient: 



«ii 



ttll «12 Oi3 



«.l1 



"23 
<*33 



- "12 



<»11 »I2 Oi3 
"21 'h.i «2:i 
031 "■gi '*33 



+ «13 



a31 Itj., 033 



Example 8.10: (i) 



(ii) 



2 


3 


4 




























6 


7 




5 


7 




5 


6 


6 


6 


7 


= 2 






- 3 






+ 4 














9 


1 




8 


1 




8 


9 


8 


9 


1 





















2(6-63) - 3(5-56) + 4(45-48) 



27 



2 3 


-4 
























—4 


v 







2 




-4 


-4 


2 


= 2 


-1 


5 


- 3 


1 


5 


+ (-4) 


1 -1 


1 -1 


5 



















2(-20 + 2) 



As n increases, the number of terms in the determinant becomes astronomical. Accord- 
ingly, we use indirect methods to evaluate determinants rather than its definition. In fact 
we prove a number of properties about determinants which will permit us to shorten the 
computation considerably. In particular, we show that a determinant of order n is equal 
to a linear combination of determinants of order m. — 1 as in case n = 3 above. 



PROPERTIES OF DETERMINANTS 

We now list basic properties of the determinant. 
Theorem 8.1: The determinant of a matrix A and its transpose A* are equal: \A\ = \A*\. 



174 DETERMINANTS [CHAP. 8 

By this theorem, any theorem about the determinant of a matrix A which concerns the 
rows of A will have an analogous theorem concerning the columns of A. 

The next theorem gives certain cases for which the determinant can be obtained 
immediately. 

Theorem 8.2: Let A be a square matrix. 

(i) If A has a row (column) of zeros, then \A\ = 0. 
(ii) If A has two identical rows (columns), then |A| = 0. 
(iii) If A is triangular, i.e. A has zeros above or below the diagonal, then 
\A\ = product of diagonal elements. Thus in particular, |/| = 1 where 
/ is the identity matrix. 

The next theorem shows how the determinant of a matrix is affected by the "elementary" 
operations. 

Theorem 8.3: Let B be the matrix obtained from a matrix A by 

(i) multiplying a row (column) of A by a scalar fc; then |B| = fe|A|. 

(ii) interchanging two rows (columns) of |A|; then |Z?j = — |A|. 

(iii) adding a multiiJle of a row (column) of A to another; then jB] = |A|. 

We now state two of the most important and useful theorems on determinants. 

Theorem 8.4: Let A be any n-square matrix. Then the following are equivalent: 
(i) A is invertible, i.e. A has an inverse A~^. 
(ii) A is nonsingular, i.e. AX - has only the zero solution, or rank A = n, 

or the rows (columns) of A are linearly independent, 
(iii) The determinant of A is not zero: |A| ¥= 0. 

Theorem 8.5: The determinant is a multiplicative function. That is, the determinant of 
a product of two matrices A and B is equal to the product of their deter- 
minants: \A B\ = \A\ \B\ . 

We shall prove the above two theorems using the theory of elementary matrices (see 
page 56) and the following lemma. 

Lemma 8.6: Let E be an elementary matrix. Then, for any matrix A, \E A\ - \E\\A\. 
We comment that one can also prove the preceding two theorems directly without 
resorting to the theory of elementary matrices. 

MINORS AND COFACTORS 

Consider an w-square matrix A = (ay). Let M« denote the (w- l)-square submatrix of 
A obtained by deleting its ith row and .7th column. The determinant \Mij\ is called the mmor 
of the element ay of A, and we define the cof actor of Oni, denoted by A«, to be the "signed" 

^^""^= A« = i-iy^^m 

Note that the "signs" (-1)*+^' accompanying the minors form a chessboard pattern with 
+'s on the main diagonal: 

■ ; : ; ::-\ 

+ - + -■■ 



We emphasize that My denotes a matrix whereas Ay denotes a scalar. 



CHAP. 8] 



DETERMINANTS 



175 



Example 8.11: Let A = 




2 3 4 

Then M23 = { 5 6 7 
8 9 1 



2 3 
8 9 



and 



= (-1)2 



2 3 
8 9 



= -(18-24) = 6 



The following theorem applies 
Theorem 8.7 



The determinant of the matrix A = (Odj) is equal to the sum of the products 
obtained by multiplying the elements of any row (column) by their re- 
spective cofactors: „ 

\A\ = OiiAii + ai2Ai2 + • • • + ttinAin = 2 o^a-^a 



j=i 



and 



CLljAij + Ct2i-'4.2j + • • • + anjAnj — ^ OijAij 



The above formulas, called the Laplace expansions of the determinant of A by the ith. 
row and the yth column respectively, offer a method of simplifying the computation of \A\. 
That is, by adding a multiple of a row (column) to another row (column) we can reduce A 
to a matrix containing a row or column with one entry 1 and the others 0. Expanding by 
this row or column reduces the computation of \A\ to the computation of a determinant of 
order one less than that of \A\. 

5 4 2 l\ 

2 3 1 —2 
Example 8.12: Compute the determinant of A = ' 



-5 -7 -3 
1 -2 -1 



Perform the following 



Note that a 1 appears in the second row, third column, 
operations on A, where fij denotes the ith row: 

(i) add -2R2 to jRi, (ii) add 3^2 to Ra, (iii) add li?2 to R^. 

By Theorem 8.3(iii), the value of the determinant does not change by these opera- 
tions; that is. 



\A\ = 



Now if we expand by the third column, we may neglect all terms which contain 0. 
Thus 



5 4 2 1 

2 3 1-2 

-5 -7-3 9 

1-2-1 4 



1 


-2 





5 


2 


3 


1 


-2 


1 


2 





3 


3 


1 





2 



\A\ = (-1)2 



{ 



1 


_2 





5 


2 


3 


1 


-2 


1 


2 





3 


3 


1 





2 



1-2 5 

= -123 

3 12 



2 3 
1 2 


- (-2) 


1 3 
3 2 


+ 5 


1 2 
3 1 



} = 



38 



CLASSICAL ADJOINT 

Consider an n-square matrix A = (an) over a field K: 



A = 



' an ai2 

0,21 ffl22 



ttin 
tt2re 



1 fflnl ffln2 



176 



DETERMINANTS 



[CHAP. 8 



The transpose of the matrix of cofactors of the elements Oij of A, denoted by adj A, is called 
the classical adjoint of A: 

'An A21 ... A„i \ 

adj A - '^^^ ^^^ ••• ^"^ 



■^nn I 



We say "classical adjoint" instead of simply "adjoint" because the term adjoint will be used 
in Chapter 13 for an entirely different concept. 



Example 8.13: Let A - 

An = + 

■^21 — — 
A31 = + 




The cofactors of the nine elements of A are 



-4 2 
-1 5 

3 -4 
-1 5 

3 -4 

-4 2 



= -18, 


A12 — 




1 


2 
5 


= -11, 


A22 - + 


2 

1 


-4 
5 


= -10, 


A32 ~ — 


2 



-4 
2 



= 2, Ai3 = + 
= 14, A23 = - 
= -4, A33 = + 



-4 

1 -l| 

2 3 

1 -1 

2 3 

-4 



= 4 



= 5 



= -8 



Theorem 8.8: 



We form the transpose of the above matrix of cofactors to obtain the classical adjoint 
of A: 

I -18 -11 -10 \ 
adj ^ = 2 14-4 

\ 4 5-8/ 

For any square matrix A, 

A -(adj A) = (adj A) -A = \A\I 
where / is the identity matrix. Thus, if \A\¥' 0, 

Observe that the above theorem gives us an important method of obtaining the inverse 
of a given matrix. 

Example 8.14: Consider the matrix A of the preceding example for which \A\ — —46. We have 

/2 3 -4\ /-18 -11 -10\ /-46 0\ /l 0^ 

A (adj A) = -4 2 2 14 -4 = 0-46 = -46 1 

\l -1 5/\ 4 5 -8/ \ -46/ \0 1^ 

= -46/ = |A|/ 
We also have, by Theorem 8.8, 

-18/-46 -11/-46 -10/-46\ / 9/23 11/46 5/23 \ 



A-i = r4T(adjA) 



2/-46 14/-46 -4/-46 = -1/23 -7/23 2/23 
4/-46 5/-46 -8/-46/ \-2/23 -5/46 4/23/ 



APPLICATIONS TO LINEAR EQUATIONS 

Consider a system of n linear equations in n unknowns: 

anX\ + ai2a;2 + • • • + ai„a;» = bi 
a2\X\ + a22a;2 + • • • + ainXn — &2 



anliCi + an%X2 + 



"T annXn 



bn 



CHAP. 8] DETERMINANTS 177 

Let A denote the determinant of the matrix A — (oij) of coefficients: A = \A\. Also, let As 
denote the determinant of the matrix obtained by replacing the ith column of A by the 
column of constant terms. The fundamental relationship between determinants and the 
solution of the above system follows. 

Theorem 8.9: The above system has a unique solution if and only if A ?^ 0. In this case 
the unique solution is given by 

_ Al _ ^ _ An 

The above theorem is known as "Cramer's rule" for solving systems of linear equations. 
We emphasize that the theorem only refers to a system with the same number of equations 
as unknowns, and that it only gives the solution when A ^ 0. In fact, if A = the theorem 
does not tell whether or not the system has a solution. However, in the case of a homo- 
geneous system we have the following useful result. 

Theorem 8.10: The homogeneous system Ax — has a nonzero solution if and only if 

A = |A| = 0. 

\2x — Zy = 1 
Example 8.15: Solve, using determinants: < 

3a; + 52/ = 1 



First compute the determinant A of the matrix of coefficients: 

2 -3 

3 5 

Since A ^^ 0, the system has a unique solution. We also have 



A = 



= 10 + 9 = 19 



A. = 



7 -3 
1 5 



= 38, 



2 7 

3 1 



-19 



Accordingly, the unique solution of the system is 



^x 38 - ^y -19 

*' = T"i9 = 2, ^ = T=l9- = -i 

We remark that the preceding theorem is of interest more for theoretical and historical 
reasons than for practical reasons. The previous method of solving systems of linear equa- 
tions, i.e. by reducing a system to echelon form, is usually much more efficient than by using 
determinants. 

DETERMINANT OF A LINEAR OPERATOR 

Using the multiplicative property of the determinant (Theorem 8.5), we obtain 
Theorem 8.11: Suppose A and B are similar matrices. Then |A| = \B\. 

Now suppose T is an arbitrary linear operator on a vector space V. We define the 
determinant of T, written det (T), by 

det(r) = |[r]e| 

where [T]e is the matrix of T in a basis {et}. By the above theorem this definition is in- 
dependent of the particular basis that is chosen. 

The next theorem follows from the analogous theorems on matrices. 

Theorem 8.12: Let T and iS be linear operators on a vector space V. Then 

(i) det (S o T) = det {S) • det {T), 

(ii) r is invertible if and only if det (7)^0. 



178 



DETERMINANTS 



[CHAP. 8 



We also remark that det (Iv) = 1 where Iv is the identity mapping, and that det (T~^) - 
det(r)-i if T is invertible. 

Example 8.16: Let T be the linear operator on R3 defined by 

T(x, y, z) — (2x — 4y + z, X — 2y + 3z, 5x + y — z) 

'2 -4 l\ 
The matrix of T in the usual basis of R3 is [T] = \ X -2 3 . Then 

,5 1-1/ 



det(r) = 



2-4 1 
1-2 3 
5 1-1 



2(2 - 3) + 4(-l - 15) + 1(1 + 10) = -55 



MULTILINEARITY AND DETERMINANTS 

Let cA denote the set of all n-square matrices A over a field K. We may view A as an 
n-tuple consisting of its row vectors Ai, A2, . . ., A„: 

A = {Ai, A2, . . ., An) 

Hence cA may be viewed as the set of n-tuples of w-tuples in K: 

The following definitions apply. 

Definition: A function D : 𝒜 → K is said to be multilinear if it is linear in each of the
components; that is:

    (i) if row Ai = B + C, then

        D(A) = D(..., B + C, ...) = D(..., B, ...) + D(..., C, ...);

    (ii) if row Ai = kB where k ∈ K, then

        D(A) = D(..., kB, ...) = k D(..., B, ...).

We also say n-linear for multilinear if there are n components.

Definition: A function D : 𝒜 → K is said to be alternating if D(A) = 0 whenever A has
two identical rows:

    D(A1, A2, ..., An) = 0   whenever   Ai = Aj,  i ≠ j

We have the following basic result; here I denotes the identity matrix.

Theorem 8.13: There exists a unique function D : 𝒜 → K such that:

    (i) D is multilinear, (ii) D is alternating, (iii) D(I) = 1.

    This function D is none other than the determinant function; that is, for
    any matrix A ∈ 𝒜, D(A) = |A|.
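For the ordinary determinant the three properties can be spot-checked numerically, treating the rows as the components. A small sketch (NumPy assumed; the rows are chosen at random):

    import numpy as np

    rng = np.random.default_rng(0)
    B, C, A2, A3 = rng.integers(-5, 5, size=(4, 3))    # four random rows

    # (i) linearity in the first row
    lhs = np.linalg.det(np.vstack([B + C, A2, A3]))
    rhs = np.linalg.det(np.vstack([B, A2, A3])) + np.linalg.det(np.vstack([C, A2, A3]))
    print(np.isclose(lhs, rhs))                                   # True

    # (ii) alternating: two identical rows give 0
    print(np.isclose(np.linalg.det(np.vstack([B, B, A3])), 0))    # True

    # (iii) D(I) = 1
    print(np.isclose(np.linalg.det(np.eye(3)), 1))                # True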




Solved Problems 



COMPUTATION OF DETERMINANTS 

8.1. Evaluate the determinant of each matrix:  (i) ( 3  -2 ),   (ii) ( a-b    a  ).
                                                   ( 4   5 )         (  a   a+b )

    (i)  | 3  -2 |
         | 4   5 |  =  3·5 - (-2)·4  =  23

    (ii) | a-b    a  |
         |  a   a+b  |  =  (a - b)(a + b) - a·a  =  -b²



8.2. Determine those values of k for which  | k   k  |  =  0.
                                            | 4  2k  |

    | k   k  |
    | 4  2k  |  =  2k² - 4k  =  0,   or   2k(k - 2) = 0.   Hence k = 0 and k = 2.

That is, if k = 0 or k = 2, the determinant is zero.



8.3. Compute the determinant of each matrix:

    (i) ( 1  2  3 ),  (ii) ( 2  0  1 ),  (iii) (  2  0  1 ),  (iv) ( 1  0  0 )
        ( 4 -2  3 )        ( 4  2 -3 )         (  3  2 -3 )        ( 3  2 -4 )
        ( 2  5 -1 )        ( 5  3  1 )         ( -1 -3  5 )        ( 4  1  3 )

    (i)  | 1  2  3 |
         | 4 -2  3 |        | -2  3 |       | 4  3 |       | 4 -2 |
         | 2  5 -1 |  =  1  |  5 -1 |  -  2 | 2 -1 |  +  3 | 2  5 |

                      =  1(2 - 15) - 2(-4 - 6) + 3(20 + 4)  =  79

    (ii)  Expanding along the first row,

          | 2  0  1 |
          | 4  2 -3 |        | 2 -3 |       | 4  2 |
          | 5  3  1 |  =  2  | 3  1 |  +  1 | 5  3 |  =  2(2 + 9) + 1(12 - 10)  =  24

    (iii) | 2  0  1 |
          | 3  2 -3 |        | 2 -3 |       |  3  2 |
          |-1 -3  5 |  =  2  |-3  5 |  +  1 | -1 -3 |  =  2(10 - 9) + 1(-9 + 2)  =  -5

    (iv)  | 1  0  0 |
          | 3  2 -4 |        | 2 -4 |
          | 4  1  3 |  =  1  | 1  3 |  =  1(6 + 4)  =  10
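The row expansion used above generalizes to a recursive procedure for matrices of any order. A minimal sketch in plain Python (no library assumed):

    def det(M):
        """Determinant by cofactor expansion along the first row."""
        n = len(M)
        if n == 1:
            return M[0][0]
        total = 0
        for j in range(n):
            # minor: delete row 0 and column j
            minor = [row[:j] + row[j+1:] for row in M[1:]]
            total += (-1) ** j * M[0][j] * det(minor)
        return total

    print(det([[1, 2, 3], [4, -2, 3], [2, 5, -1]]))    # 79, as in (i)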



8.4. Consider the 3-square matrix A = ( a1  b1  c1 ). Show that the diagrams below can
                                      ( a2  b2  c2 )
                                      ( a3  b3  c3 )
be used to obtain the determinant of A:

    (diagrams not reproduced: each joins the three entries of a diagonal of A by an arrow)

Form the product of each of the three numbers joined by an arrow in the diagram on the left,
and precede each product by a plus sign as follows:

    + a1b2c3 + b1c2a3 + c1a2b3

Now form the product of each of the three numbers joined by an arrow in the diagram on the
right, and precede each product by a minus sign as follows:

    - a3b2c1 - b3c2a1 - c3a2b1

Then the determinant of A is precisely the sum of the above two expressions:

    |A|  =  | a1  b1  c1 |
            | a2  b2  c2 |
            | a3  b3  c3 |  =  a1b2c3 + b1c2a3 + c1a2b3 - a3b2c1 - b3c2a1 - c3a2b1

The above method of computing |A| does not hold for determinants of order greater than 3.



8.5. Evaluate the determinant of each matrix:

    (i) ( 2  0 -1 ),   (ii) ( a  b  c ),   (iii) (  3  2 -4 )
        ( 3  0  2 )         ( c  a  b )          (  1  0 -2 )
        ( 4 -3  7 )         ( b  c  a )          ( -2  3  3 )

    (i) Expand the determinant by the second column, neglecting terms containing a 0:

        | 2  0 -1 |
        | 3  0  2 |               | 2 -1 |
        | 4 -3  7 |  =  -(-3)     | 3  2 |  =  3(4 + 3)  =  21

    (ii) Use the method of the preceding problem:

        | a  b  c |
        | c  a  b |
        | b  c  a |  =  a³ + b³ + c³ - abc - abc - abc  =  a³ + b³ + c³ - 3abc

    (iii) Add twice the first column to the third column, and then expand by the second row:

        |  3  2 -4 |     |  3  2  -4 + 2(3)  |     |  3  2  2 |
        |  1  0 -2 |  =  |  1  0  -2 + 2(1)  |  =  |  1  0  0 |  =  -| 2  2 |
        | -2  3  3 |     | -2  3   3 + 2(-2) |     | -2  3 -1 |       | 3 -1 |  =  -(-2 - 6)  =  8

8.6. Evaluate the determinant of A = ( 1/2  -1   -1/3 )
                                     ( 3/4  1/2   -1  )
                                     (  1   -4     1  )

First multiply the first row by 6 and the second row by 4. Then

    6·4·|A|  =  24|A|  =  | 3 -6 -2 |
                          | 3  2 -4 |
                          | 1 -4  1 |

Now add 4 times the first column to the second column, and subtract the first column from the
third column:

    24|A|  =  | 3  -6 + 4(3)  -2 - (3) |     | 3   6  -5 |
              | 3   2 + 4(3)  -4 - (3) |  =  | 3  14  -7 |  =  + |  6  -5 |
              | 1  -4 + 4(1)   1 - (1) |     | 1   0   0 |       | 14  -7 |  =  -42 + 70  =  28

and |A| = 28/24 = 7/6.


8.7. Evaluate the determinant of A = (  2  5 -3 -2 )
                                     ( -2 -3  2 -5 )
                                     (  1  3 -2  2 )
                                     ( -1 -6  4  3 )

Note that a 1 appears in the third row, first column. Apply the following operations on A
(where Ri denotes the ith row): (i) add -2R3 to R1, (ii) add 2R3 to R2, (iii) add R3 to R4. Thus

    |A|  =  |  2  5 -3 -2 |     |  0 -1  1 -6 |
            | -2 -3  2 -5 |  =  |  0  3 -2 -1 |  =  + | -1  1 -6 |
            |  1  3 -2  2 |     |  1  3 -2  2 |       |  3 -2 -1 |
            | -1 -6  4  3 |     |  0 -3  2  5 |       | -3  2  5 |

(expanding by the first column). Now add the second column to the first column and 6 times
the second column to the third column:

    |A|  =  | -1 + 1   1   -6 + 6(1)  |     |  0  1   0 |
            |  3 - 2  -2   -1 + 6(-2) |  =  |  1 -2 -13 |  =  -|  1 -13 |
            | -3 + 2   2    5 + 6(2)  |     | -1  2  17 |      | -1  17 |  =  -(17 - 13)  =  -4
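The row-and-column reductions used in the last two problems can be mechanized: reduce the matrix to triangular form, keeping track of row interchanges (Theorem 8.3), and multiply the diagonal entries (Theorem 8.2(iii)). A sketch in plain Python using floating-point Gaussian elimination:

    def det_by_elimination(A):
        """Determinant via Gaussian elimination with partial pivoting."""
        M = [row[:] for row in A]          # work on a copy
        n = len(M)
        sign = 1.0
        for k in range(n):
            # pivot: bring up the row with the largest entry in column k
            p = max(range(k, n), key=lambda i: abs(M[i][k]))
            if abs(M[p][k]) < 1e-12:
                return 0.0                  # no usable pivot: determinant is 0
            if p != k:
                M[k], M[p] = M[p], M[k]
                sign = -sign               # Theorem 8.3(ii): an interchange flips the sign
            for i in range(k + 1, n):
                m = M[i][k] / M[k][k]
                for j in range(k, n):
                    M[i][j] -= m * M[k][j]  # Theorem 8.3(iii): does not change |A|
        prod = sign
        for k in range(n):
            prod *= M[k][k]                 # Theorem 8.2(iii): triangular case
        return prod

    # Problem 8.7:
    A = [[2, 5, -3, -2], [-2, -3, 2, -5], [1, 3, -2, 2], [-1, -6, 4, 3]]
    print(round(det_by_elimination(A)))     # -4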




8.8. Evaluate the determinant of A = (  3 -2 -5  4 )
                                     ( -5  2  8 -5 )
                                     ( -2  4  7 -3 )
                                     (  2 -3 -5  8 )

First reduce A to a matrix which has 1 as an entry, such as adding twice the first row to the
second row, and then proceed as in the preceding problem.

    |A|  =  |  3 -2 -5  4 |     |  3 -2 -5  4 |
            | -5  2  8 -5 |  =  |  1 -2 -2  3 |
            | -2  4  7 -3 |     | -2  4  7 -3 |
            |  2 -3 -5  8 |     |  2 -3 -5  8 |

Now add twice the first column to the second and third columns, and -3 times the first column
to the fourth column:

    |A|  =  |  3  -2 + 2(3)   -5 + 2(3)    4 - 3(3)  |     |  3  4  1 -5 |
            |  1  -2 + 2(1)   -2 + 2(1)    3 - 3(1)  |  =  |  1  0  0  0 |
            | -2   4 + 2(-2)   7 + 2(-2)  -3 - 3(-2) |     | -2  0  3  3 |
            |  2  -3 + 2(2)   -5 + 2(2)    8 - 3(2)  |     |  2  1 -1  2 |

Expanding by the second row,

    |A|  =  - | 4  1 -5 |
              | 0  3  3 |
              | 1 -1  2 |

Now subtract the second column from the third column and expand by the second row:

    |A|  =  - | 4  1 -5 - (1)  |     | 4  1 -6 |
              | 0  3  3 - (3)  |  =  -| 0  3  0 |  =  -3 | 4 -6 |
              | 1 -1  2 - (-1) |      | 1 -1  3 |        | 1  3 |  =  -3(12 + 6)  =  -54



8.9. Evaluate the determinant of A = ( t+3  -1    1  )
                                     (  5   t-3   1  )
                                     (  6   -6   t+4 )

Add the second column to the first column, and then add the third column to the second column
to obtain

    |A|  =  | t+2   0    1  |
            | t+2  t-2   1  |
            |  0   t-2  t+4 |

Now factor t + 2 from the first column and t - 2 from the second column to get

    |A|  =  (t + 2)(t - 2) | 1  0   1  |
                           | 1  1   1  |
                           | 0  1  t+4 |

Finally subtract the first column from the third column to obtain

    |A|  =  (t + 2)(t - 2) | 1  0   0  |
                           | 1  1   0  |
                           | 0  1  t+4 |  =  (t + 2)(t - 2)(t + 4)






COFACTORS 

8.10. Find the cofactor of the 7 in the matrix  ( 2  1 -3  4 )
                                                ( 5 -4  7 -2 )
                                                ( 4  0  6 -3 )
                                                ( 3 -2  5  2 )

    A23  =  (-1)^(2+3) | 2  1  4 |
                       | 4  0 -3 |
                       | 3 -2  2 |  =  -[2(0 - 6) - 1(8 + 9) + 4(-8 - 0)]  =  -(-61)  =  61

The exponent 2 + 3 comes from the fact that 7 appears in the second row, third column.



8.11. Consider the matrix A = ( 1  2  3 ). (i) Compute |A|. (ii) Find adj A. (iii) Verify
                              ( 2  3  4 )
                              ( 1  5  7 )
A·(adj A) = |A| I. (iv) Find A⁻¹.

    (i)  |A|  =  1 | 3  4 |  -  2 | 2  4 |  +  3 | 2  3 |
                   | 5  7 |       | 1  7 |       | 1  5 |  =  1 - 20 + 21  =  2

    (ii) adj A  =  ( +|3 4; 5 7|   -|2 4; 1 7|   +|2 3; 1 5| )ᵗ      (   1   1  -1 )
                   ( -|2 3; 5 7|   +|1 3; 1 7|   -|1 2; 1 5| )   =   ( -10   4   2 )
                   ( +|2 3; 3 4|   -|1 3; 2 4|   +|1 2; 2 3| )       (   7  -3  -1 )

         That is, adj A is the transpose of the matrix of cofactors. Observe that the "signs" in the
         matrix of cofactors form the chessboard pattern   ( +  -  + )
                                                           ( -  +  - )
                                                           ( +  -  + )

    (iii) A·(adj A)  =  ( 1  2  3 )(   1   1  -1 )     ( 2  0  0 )
                        ( 2  3  4 )( -10   4   2 )  =  ( 0  2  0 )  =  |A| I
                        ( 1  5  7 )(   7  -3  -1 )     ( 0  0  2 )

    (iv) A⁻¹  =  (1/|A|)(adj A)  =  (  1/2   1/2  -1/2 )
                                    (  -5     2     1  )
                                    (  7/2  -3/2  -1/2 )
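The construction of adj A from the cofactors, and the formula A⁻¹ = (adj A)/|A|, translate directly into code. A sketch (NumPy assumed for computing the minors):

    import numpy as np

    def adjugate(A):
        """Classical adjoint: transpose of the matrix of cofactors."""
        A = np.asarray(A, dtype=float)
        n = A.shape[0]
        cof = np.empty((n, n))
        for i in range(n):
            for j in range(n):
                minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
                cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
        return cof.T

    A = np.array([[1, 2, 3], [2, 3, 4], [1, 5, 7]])
    adjA = adjugate(A)
    print(np.round(adjA))                                          # adj A, as in (ii)
    print(np.allclose(A @ adjA, np.linalg.det(A) * np.eye(3)))     # A(adj A) = |A| I
    print(np.round(adjA / np.linalg.det(A), 3))                    # A^{-1}, as in (iv)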



8.12. Consider an arbitrary 2 by 2 matrix A = ( a  b ).
                                              ( c  d )
(i) Find adj A. (ii) Show that adj (adj A) = A.

    (i)  adj A  =  ( +|d|  -|c| )ᵗ     (  d  -b )
                   ( -|b|  +|a| )   =  ( -c   a )

    (ii) adj (adj A)  =  adj (  d  -b )     ( a  b )
                             ( -c   a )  =  ( c  d )  =  A






DETERMINANTS AND SYSTEMS OF LINEAR EQUATIONS 
8.13. Solve for x and y, using determinants:

    (i)  2x +  y = 7          (ii)  ax - 2by =  c
         3x - 5y = 4                3ax - 5by = 2c ,   where ab ≠ 0.

    (i)  Δ = | 2   1 |            Δx = | 7   1 |             Δy = | 2  7 |
             | 3  -5 |  =  -13,        | 4  -5 |  =  -39,         | 3  4 |  =  -13.

         Then x = Δx/Δ = 3,  y = Δy/Δ = 1.

    (ii) Δ = | a   -2b |           Δx = | c   -2b |            Δy = | a    c  |
             | 3a  -5b |  =  ab,        | 2c  -5b |  =  -bc,        | 3a  2c  |  =  -ac.

         Then x = Δx/Δ = -c/a,  y = Δy/Δ = -c/b.



8.14. Solve using determinants:    3y + 2x = z + 1
                                   3x + 2z = 8 - 5y
                                   3z - 1 = x - 2y

First arrange the system in standard form with the unknowns appearing in columns:

    2x + 3y -  z =  1
    3x + 5y + 2z =  8
     x - 2y - 3z = -1

Compute the determinant Δ of the matrix A of coefficients:

    Δ  =  | 2  3 -1 |
          | 3  5  2 |
          | 1 -2 -3 |  =  2(-15 + 4) - 3(-9 - 2) - 1(-6 - 5)  =  22

Since Δ ≠ 0, the system has a unique solution. To obtain Δx, Δy and Δz, replace the coefficients of
the unknown in the matrix A by the column of constants. Thus

    Δx = |  1  3 -1 |            Δy = | 2  1 -1 |             Δz = | 2  3  1 |
         |  8  5  2 |  =  66,         | 3  8  2 |  =  -22,         | 3  5  8 |  =  44
         | -1 -2 -3 |                 | 1 -1 -3 |                  | 1 -2 -1 |

and x = Δx/Δ = 3, y = Δy/Δ = -1, z = Δz/Δ = 2.



PROOF OF THEOREMS 

8.15. Prove Theorem 8.1: |Aᵗ| = |A|.

Suppose A = (aij). Then Aᵗ = (bij) where bij = aji. Hence

    |Aᵗ|  =  Σ over σ in Sn of (sgn σ) b1σ(1) b2σ(2) ... bnσ(n)
          =  Σ over σ in Sn of (sgn σ) aσ(1),1 aσ(2),2 ... aσ(n),n

Let τ = σ⁻¹. By Problem 8.36, sgn τ = sgn σ, and aσ(1),1 aσ(2),2 ... aσ(n),n = a1τ(1) a2τ(2) ... anτ(n).
Hence

    |Aᵗ|  =  Σ over σ in Sn of (sgn τ) a1τ(1) a2τ(2) ... anτ(n)

However, as σ runs through all the elements of Sn, τ = σ⁻¹ also runs through all the elements of
Sn. Thus |Aᵗ| = |A|.



8.16. Prove Theorem 8.3(ii): Let B be obtained from a square matrix A by interchanging 
two rows (columns) of A. Then |B| = — |A|. 

We prove the theorem for the case that two columns are interchanged. Let τ be the trans-
position which interchanges the two numbers corresponding to the two columns of A that are
interchanged. If A = (aij) and B = (bij), then bij = a_i,τ(j). Hence, for any permutation σ,

    b1σ(1) b2σ(2) ... bnσ(n)  =  a1,τσ(1) a2,τσ(2) ... an,τσ(n)

Thus   |B|  =  Σ over σ of (sgn σ) b1σ(1) b2σ(2) ... bnσ(n)
            =  Σ over σ of (sgn σ) a1,τσ(1) a2,τσ(2) ... an,τσ(n)

Since the transposition τ is an odd permutation, sgn τσ = sgn τ · sgn σ = -sgn σ. Thus sgn σ =
-sgn τσ, and so

    |B|  =  - Σ over σ of (sgn τσ) a1,τσ(1) a2,τσ(2) ... an,τσ(n)

But as σ runs through all the elements of Sn, τσ also runs through all the elements of Sn; hence

    |B| = -|A|.

8.17. Prove Theorem 8.2: (i) If A has a row (column) of zeros, then \A\ = 0. (ii) If A has 
two identical rows (columns), then \A\ - 0. (iii) If A is triangular, then \A\ = product 
of diagonal elements. Thus in particular, |/| = 1 where / is the identity matrix. 

(i) Each term in |A| contains a factor from every row and so from the row of zeros. Thus each 
term of |A| is zero and so \A\ = 0. 

(ii) Suppose 1 + 1 # in K. If we interchange the two identical rows of A, we still obtain the 
matrix A. Hence by the preceding problem, 1A| = — |A| and so \A\ = 0. 

Now suppose 1 + 1 = in K. Then sgn <r = 1 for every a e S„. Since A has two iden- 
tical rows, we can arrange the terms of A into pairs of equal terms. Since each pair is 0, the 
determinant of A is zero. 

(iii) Suppose A = (ay) is lower triangular, that is, the entries above the diagonal are all zero: 
a^j = whenever i < j. Consider a term t of the determinant of A: 

t = (sgn <r) aiij a2i2 • • • '*"*n' where <t = i^H ...in 
Suppose ii ¥- 1. Then 1< ii and so a^^ = 0; hence t = 0. That is, each term for which 
ii 7^ 1 is zero. 

Now suppose ti = 1 but iz ¥- 2. Then 2 < ig and so a^ = 0; hence t = 0. Thus each 
term for which ij ^ 1 or 12 ^ 2 is zero. 

Similarly we obtain that each term for which ij 7^ 1 or % # 2 or ... or t„ 9^ n is zero. 
Accordingly, 1A| = a^^a^^ . . . a^n = product of diagonal elements. 

8.18. Prove Theorem 8.3: Let B be obtained from A by 

(i) multiplying a row (column) of A by a scalar fe; then |B| = fe |A| . 
(ii) interchanging two rows (columns) of A; then |B| = - |A|. 
(iii) adding a multiple of a row (column) of A to another; then |B1 = \A\. 
(i) If the jth row of A is multiplied by fc, then every term in |A| is multiplied by fc and so 
\B\ = k\A\. That is, 

|B| = 2 (sgn o) an a2t2 • • • C^^jiP • • • «ni„ 

= fc 2 (sgn a) aii a2i2 . . . Oni„ = ^ 1^1 
a 

(ii) Proved in Problem 8.16. 

(iii) Suppose c times the feth row is added to the jth row of A. Using the symbol /\ to denote the 
yth position in a determinant term, we have 

\B\ = 2 (sgn or) aii aji^ . . . {ca^i^ + ajj.) . . . a„i^ 

= c 2 (sgn <r) a„j agi^ • . • «fci^ • • ■ ««!„ + 2 (sgn c) a^i^ a^i^. . . a^. . . . a„i^ 

The first sum is the determinant of a matrix whose feth and ;th rows are identical; 
hence by Theorem 8.2(ii) the sum is zero. The second sum is the determinant of A. Thus 

|B| = c'0 + 1A| = A. 




8.19. Prove Lemma 8.6: For any elementary matrix E, |EA| = |E| |A|.

Consider the following elementary row operations: (i) multiply a row by a constant k ≠ 0;
(ii) interchange two rows; (iii) add a multiple of one row to another. Let E1, E2 and E3 be the
corresponding elementary matrices. That is, E1, E2 and E3 are obtained by applying the above
operations, respectively, to the identity matrix I. By the preceding problem,

    |E1| = k|I| = k,    |E2| = -|I| = -1,    |E3| = |I| = 1

Recall (page 56) that EiA is identical to the matrix obtained by applying the corresponding
operation to A. Thus by the preceding problem,

    |E1A| = k|A| = |E1| |A|,    |E2A| = -|A| = |E2| |A|,    |E3A| = |A| = 1|A| = |E3| |A|
and the lemma is proved. 

8.20. Suppose B is row equivalent to A; say B = EnEn-i . . . E2E1A where the E, are 
elementary matrices. Show that: 

(i) |B| = |En| |En-1| ... |E2| |E1| |A|,   (ii) |B| ≠ 0 if and only if |A| ≠ 0.

(i) By the preceding problem, |E1A| = |E1| |A|. Hence by induction,

    |B| = |En| |En-1 ... E2E1A| = |En| |En-1| ... |E2| |E1| |A|

(ii) By the preceding problem, |Ei| ≠ 0 for each i. Hence |B| ≠ 0 if and only if |A| ≠ 0.

8.21. Prove Theorem 8.4: Let A be an n-square matrix. Then the following are equivalent:
(i) A is invertible, (ii) A is nonsingular, (iii) |A| ≠ 0.

By Problem 6.44, (i) and (ii) are equivalent. Hence it suffices to show that (i) and (iii) are
equivalent.

Suppose A is invertible. Then A is row equivalent to the identity matrix I. But |I| ≠ 0; hence
by the preceding problem, |A| ≠ 0. On the other hand, suppose A is not invertible. Then A is row
equivalent to a matrix B which has a zero row. By Theorem 8.2(i), |B| = 0; then by the preceding
problem, |A| = 0. Thus (i) and (iii) are equivalent.

8.22. Prove Theorem 8.5: |AB| = |A| |B|.

If A is singular, then AB is also singular and so |AB| = 0 = |A| |B|. On the other hand, if A
is nonsingular, then A = En ... E2E1, a product of elementary matrices. Thus, by Problem 8.20,

    |A|  =  |En ... E2E1 I|  =  |En| ... |E2| |E1| |I|  =  |En| ... |E2| |E1|

and so   |AB|  =  |En ... E2E1 B|  =  |En| ... |E2| |E1| |B|  =  |A| |B|

8.23. Prove Theorem 8.7: Let A = (a«); then \A\ = anAn + ttizAia + • • • + ai„Ai„, where 
Aij is the cofactor of an. 

Each term in \A\ contains one and only one entry of the ith row (aij.Ojg, . . ., a,„) of A. Hence 
we can write \A \ in the form 

|A| = ajiAfi + ai2A*2 + ■■■ + ai„Af„ 

(Note Ay is a sum of terms involving no entry of the ith row of A.) Thus the theorem is proved if 
we can show that 

At. = A;,. = (-l)«+i|M„I 

where Afy is the matrix obtained by deleting the row and column containing the entry ay. (His- 
torically, the expression Ay was defined as the cofactor of Oy, and so the theorem reduces to showing 
that the two definitions of the cofactor are equivalent.) 




First we consider the case that i = n, j = n. Then the sum of terms in \A\ containing a„„ is 

Orm'^nn = «nn 2 (sgn a) ffli<r(i) 02<t(2) • • • <»n-l,cr(n-l) 
a 

where we sum over all permutations aSS„ for which ain) = n. However, this is equivalent (Prob- 
lem 8.63) to summing over all permutations of {1, . . .,n-l}. Thus A*„ = |M„„| = (-!)»+« 1M„„| . 

Now we consider any i and }. We interchange the ith. row with each succeeding row until it is 
last, and we interchange the jth column with each succeeding column until it is last. Note that the 
determinant |Afy| is not affected since the relative positions of the other rows and columns are not 
affected by these interchanges. However, the "sign" of |A| and of Ay is changed n — i and then 
n — j times. Accordingly, 

A% = (-l)»-i + »-i |M„| = (-l)* + MMy| 

8.24. Let A = (an) and let B be the matrix obtained from A by replacing the ith row of A 
by the row vector (bn, . . . , &m). Show that 

|B| = biiAn + bt2Aa + • • • 4- &i„Ai„ 

Furthermore, show that, for j ¥= i, 

ajiAn + (lizAti + • • • + ttjnAin — 
and aijAii + 023^2! + • • • 4- a„iA„i = 

Let B = (6y). By the preceding problem, 

\B\ = ftiiBa + 6i2^i2 + ■ • • + 6i„Bi„ 

Since By does not depend upon the ith row of B, By = Ay for j = 1 n. Hence 

\B\ = ftjiAji + 6i2Ai2 + • • • + 6i„Ai„ 

Now let A' be obtained from A by replacing the ith row of A by the jth row of A. Since A' 
has two identical rows, |A'| = 0. Thus by the above result, 

|A'| = ajiAji + aj2-^i2 + • • • + O-jn^ln = 

Using |A*| — \A\, we also obtain that Oti-^ii + «2j-^2i + •" • + «ni-Ani = 0. 

8.25. Prove Theorem 8.8: A-(adjA) = (adjA)-A = |A|/. Thus if |AI ^ 0, A-* = 
(l/|Al)(ad3A). 

Let A = (ay) and let A • (adj A) = (fty). The ith row of A is 

(«ii. ffliz. • • • . <*»tl) '^) 

Since adj A is the transpose of the matrix of cofactors, the ith column of adj A is the transpose of 
the cofactors of the jth row of A: 

(A,i,Aj2, ...,A,„)« (2) 

Now by, the ij-entry in A • (adj A), is obtained by multiplying (1) and (2): 

6y = a^Aji + Uj^Ajz + • • • + ttin'Ajn 

Thus by Theorem 8.7 and the preceding problem, 

" \ if i ^ i 

Accordingly, A -(adj A) is the diagonal matrix with each diagonal element \A\. In other words, 
A • (adj A) = \A] I. Similarly, (adj A)-A = \A\ 1. 




8.26. Prove Theorem 8.9: The system of linear equations Ax = b has a unique solution 
if and only if A = ]A| ^ 0. In this case the unique solution is given by Xi = Ai/A, 

X2 = A2/A, . . . , Xn = An/A. 

By preceding results, Ax = b has a unique solution if and only if A is invertible, and A is 
invertible if and only if A = |A| # 0. 

Now suppose A#0. By Problem 8.25, A-i = (1/A)(adj A). Multiplying Ax = b by A -1, we 
obtain 

X = A-^Ax = (1/A)(adj A)b (1) 

Notethattheithrowof (l/A)(adjA) is (l/A)(Aii, A^, . . ., A„j). If 6 = (61, 63, • • ., &„)' then, by (i), 

Xi = (1/A)(6i Aii + b^A^i +■■■+ 6„A J 

However, as in Problem 8.24; 

6iAii + 62^21 + • • • + 6„A„j = A; 

the determinant of the matrix obtained by replacing the ith column of A by the column vector 6. 
Thus Xi = (l/A)Aj, as required. 

8.27. Suppose P is invertible. Show that |P⁻¹| = |P|⁻¹.

P⁻¹P = I. Hence 1 = |I| = |P⁻¹P| = |P⁻¹| |P|, and so |P⁻¹| = |P|⁻¹.

8.28. Prove Theorem 8.11: Suppose A and B are similar matrices. Then |A| = |B|.

Since A and B are similar, there exists an invertible matrix P such that B = P⁻¹AP. Then
by the preceding problem, |B| = |P⁻¹AP| = |P⁻¹| |A| |P| = |A| |P⁻¹| |P| = |A|.

We remark that although the matrices P⁻¹ and A may not commute, their determinants |P⁻¹|
and |A| do commute since they are scalars in the field K.

8.29. Prove Theorem 8.13: There exists a unique function D.cA^K such that (i) D is 
multilinear, (ii) D is alternating, (iii) D(/) = 1. This function D is the determinant 
function, i.e. D{A) = |A|. 

Let D be the determinant function: D(A) = \A\. We must show that D satisfies (i), (ii) and (iii), 
and that D is the only function satisfying (1), (ii) and (iii). 

By preceding results, D satisfies (ii) and (iii); hence we need show that it is multilinear. Suppose 
■A = (««) = (Ai, Ag, . . ., A„) where A^ is the fcth row of A. Furthermore, suppose for a fixed i, 

Aj = Bi + Cj, where Bi = (b^, . . . , 6„) and Q = (ci, . . . , c„) 

Accordingly, a^ = 61 + Cj, ajj =63 + 02, ..., «;„ = 6„ + c„ 

Expanding D(A) - \A\ by the ith row, 

D(A) = D(A„ . . ., Bi + Ci, . . ., A„) = aaA^, + at^A^^ + ■■■ + Ui^A^, 

= (61 + ci)Aii + (62 + C2)A;2 + . . . + (6„ + c„)Ai„ 

= (fciAii + 62Ai2 + • • • + 6„Ai„) + (ciA(i + C2A,.2 + • ■ • + c^Ai^) 

However, by Problem 8.24, the two sums above are the determinants of the matrices obtained from 
A by replacing the ith row by Bj and Q respectively. That is, 

D(A) = I>(A„...,Bi + Ci, ...,A„) 

= I>(Ai, ...,Bi, ...,A„) + Z)(Ai, ...,Ci, ...,A„) 

Furthermore, by Theorem 8.3(i), 

Z)(Ai, ...,fcAj A„) = fcD(A„...,Ai, ...,A„) 

Thus B is multilinear, i.e. D satisfies (iii). 




We next must prove the uniqueness of D. Suppose D satisfies (i), (ii) and (iii). If {e^, . . . , e„} 
is the usual basis of K", then by (iii), Z>(ei, eg, . . ., e„) = D{I) = 1. Using (ii) we also have (Problem 
8.73) that 

Z)(ejj,ei2. •■■'%) = sgna, where a = in^ . . . i^ (D 

Now suppose A = (ay). Observe that the fcth row A^^ of A is 

^k = («ki. «fc2. ■■■> "fcn) = «fci«i + "k2«2 + ■ • • + afc„e„ 

Thus I'(A) = I>(aiiei + • • • + ai„e„, 02161 + • • • + a2„e„, . . . , o„iei + • • ■ + a„„e„) 

Using the multilinearity of D, we can write D(A) as a sum of terms of the form 

= 2 ("Hj a2i2 • • • '^"iJ ^'•\' %• ■■■' K^ 

where the sum is summed over all sequences 11^2 • ■ ■ in where i^ G {1, . . . , n}. If two of the indices 
are equal, say ij = i^ but j ¥' k, then by (ii), 

^(%. %, ■•■'\) = 

Accordingly, the sum in (2) need only be summed over all permutations a = i^i^ ■ ■ . ■J„. Using (1), 
we finally have that 

D{A) = 2 («iii "212 •• • «ni„) D{ei^, Bj^, . . . , ejj 

= 2 (sgn a) a„^ a2i^ . . . a„i^, where a - i^i2 ...in 

Hence D is the determinant function and so the theorem is proved. 

PERMUTATIONS 

8.30. Determine the parity of a = 542163. 

Method 1. 

We need to obtain the number of pairs (i, j) for which i > j and i precedes j in σ. There are:

    3 numbers (5, 4 and 2) greater than and preceding 1,
    2 numbers (5 and 4) greater than and preceding 2,
    3 numbers (5, 4 and 6) greater than and preceding 3,
    1 number (5) greater than and preceding 4,
    0 numbers greater than and preceding 5,
    0 numbers greater than and preceding 6.

Since 3 + 2 + 3 + 1 + 0 + 0 = 9 is odd, σ is an odd permutation and so sgn σ = -1.

Method 2. 

Transpose 1 to the first position as follows: 

    5 4 2 1 6 3   to   1 5 4 2 6 3
Transpose 2 to the second position: 

154263 to 125463 
Transpose 3 to the third position: 

125463 to 123546 
Transpose 4 to the fourth position: 

    1 2 3 5 4 6   to   1 2 3 4 5 6

Note that 5 and 6 are in the "correct" positions. Count the number of numbers "jumped": 
3 + 2 + 3 + 1 = 9. Since 9 is odd, a is an odd permutation. (Remark: This method is essentially 
the same as the preceding method.) 




Method 3. 

An interchange of two numbers in a permutation is equivalent to multiplying the permutation
by a transposition. Hence transform σ to the identity permutation using transpositions; for example,

    5 4 2 1 6 3  →  1 4 2 5 6 3  →  1 2 4 5 6 3  →  1 2 3 5 6 4  →  1 2 3 4 6 5  →  1 2 3 4 5 6
Since an odd number, 5, of transpositions was used, σ is an odd permutation.
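Method 1, counting the pairs that are out of order, is the easiest to mechanize. A short sketch in plain Python, with the permutation given as a tuple of the numbers 1, ..., n:

    def sgn(perm):
        """Sign of a permutation: (-1)**(number of inversions)."""
        inversions = sum(1 for i in range(len(perm))
                           for j in range(i + 1, len(perm))
                           if perm[i] > perm[j])
        return 1 if inversions % 2 == 0 else -1

    print(sgn((5, 4, 2, 1, 6, 3)))    # -1, since 542163 has 9 inversions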



8.31. Let σ = 24513 and τ = 41352 be permutations in S5. Find (i) the composition per-
mutations τ∘σ and σ∘τ, (ii) σ⁻¹.

Recall that σ = 24513 and τ = 41352 are short ways of writing

    σ = ( 1 2 3 4 5 )   and   τ = ( 1 2 3 4 5 )
        ( 2 4 5 1 3 )             ( 4 1 3 5 2 )

which means

    σ(1) = 2, σ(2) = 4, σ(3) = 5, σ(4) = 1 and σ(5) = 3
and
    τ(1) = 4, τ(2) = 1, τ(3) = 3, τ(4) = 5 and τ(5) = 2

(i) Apply σ and then τ:

    (τ∘σ)(1) = τ(2) = 1,  (τ∘σ)(2) = τ(4) = 5,  (τ∘σ)(3) = τ(5) = 2,
    (τ∘σ)(4) = τ(1) = 4,  (τ∘σ)(5) = τ(3) = 3

and, applying τ and then σ,

    (σ∘τ)(1) = σ(4) = 1,  (σ∘τ)(2) = σ(1) = 2,  (σ∘τ)(3) = σ(3) = 5,
    (σ∘τ)(4) = σ(5) = 3,  (σ∘τ)(5) = σ(2) = 4

Thus τ∘σ = 15243 and σ∘τ = 12534.

(ii) Interchange the two rows of σ and then rearrange the columns:

    σ⁻¹ = ( 1 2 3 4 5 )⁻¹ = ( 2 4 5 1 3 ) = ( 1 2 3 4 5 )
          ( 2 4 5 1 3 )     ( 1 2 3 4 5 )   ( 4 1 5 2 3 )

That is, σ⁻¹ = 41523.









8.32. Consider any permutation <7 = jiji . . . jn. Show that for each pair (i, k) such that 

i> k and i precedes kin a 

there is a pair (i*, k*) such that 

i* < k* and cr(i*) > <r(fc*) (i) 

and vice versa. Thus cr is even or odd according as to whether there is an even or 
odd number of pairs satisfying (1). 

Choose i* and A;* so that a(i*) = i and a{k*) = fc. Then i > fe if and only if <r(i*) > a(fe*), 
and i precedes k in a it and only if i* < k*. 




8.33. Consider the polynomial g - g{xi, ...,Xn) = y[{Xi-Xj). Write out explicitly the 
polynomial g — g{xi, X2, Xs, xt). '"^^ 

The symbol 11 is used for a product of terms in the same way that the symbol 2 is used for a 

sum of terms. That is, Yl (a'i ~ ""j) means the product of all terms (Kj — Xj) for which i < j. Hence 
i<i 

9 - g{Xl,...,Xi) = iXi — X2){Xi — Xa)(Xi — Xi)(X2-X3)(X2-Xi)(X3 — X^) 



8.34. Let (T be an arbitrary permutation. For the polynomial g in the preceding problem, 
define <T{g) = Yl (^-^c" ~ ^(tw)- Show that 

'i<3 

{ g if o- is even 
[—g it a IS odd 
Accordingly, aig) = (sgn <j)g. 
Since a is one-one and onto, 

a{g) = n (a'.ra) - a'.ru)) = . .11. ,{xi-Xj) 

to" KjorOj 

Thus a{g) — g or a(g) = —g according as to whether there is an even or an odd number of terms 
of the form («; - Xj) where i > j. Note that for each pair (i, j) for which 

i < } and 17(1) > <r{j) (1) 

there is a term (ajo-fj, - x^^j-^) in a(g) for which a(i) > a(j). Since a is even if and only if there is an 
even number of pairs satisfying (1), we have a(g) = g it and only if a is even; hence <r{g) = -g 
if and only if a is odd. 



8.35. Let u,tG Sn. Show that sgn (to a) = (sgn T)(sgn <t). Thus the product of two even 
or two odd permutations is even, and the product of an odd and an even permutation 
is odd. 

Using the preceding problem, we have 

sgXi-{r °a)g = {T°c)(g) = T(CT(flr)) = T((sgn <r)sr) = (sgn T)(sgn <T)sf 

Accordingly, sgn(TO<r) = (sgn T)(sgn <r). 



8.36. Consider the permutation a = J1J2 . . . jn. Show that sgn (7-1 = sgn <t and, for scalars 
" ttj^i aj^2 . . . ttj^n = aik^chk^ ■ • ■ Omk^ where <7~^ = kiki . . . kn 

We have a-^°<T = c, the identity permutation. Since e is even, cri and a are both even or 
both odd. Hence sgn <r-i = sgn a. 

Since a - JiJz ■■■in is a permutation, aj^i Oj^a • • • a,j^n = "iki "■zkz ■ ■ ■ »nk„- Then fej, k^, ■..,k„ 
have the property that 

ff(fci) = 1, ^(fea) = 2, . . ., "(K) = n 

Let T = ^1^2 ■ ■ ■ kn- Then for i = 1, . . . , w, 

(<j°T)(i) = a{T(i)) = a(fcj) = i 

Thus a°T — e, the identity permutation; hence t = ct~i. 






MISCELLANEOUS PROBLEMS 

8.37. Find det(T) for each linear operator T:

(i) T is the operator on R³ defined by

    T(x, y, z) = (2x - z, x + 2y - 4z, 3x - 3y + z)

(ii) T is the operator on the vector space V of 2-square matrices over K defined by
T(A) = MA, where M = ( a  b ).
                     ( c  d )

(i) Find the matrix representation of T relative to, say, the usual basis:

    [T]  =  ( 2   0  -1 )
            ( 1   2  -4 )
            ( 3  -3   1 )
Then
    det(T)  =  | 2   0  -1 |
               | 1   2  -4 |
               | 3  -3   1 |  =  2(2 - 12) - 1(-3 - 6)  =  -11

(ii) Find a matrix representation of T in some basis of V, say,

    E1 = ( 1 0 ),  E2 = ( 0 1 ),  E3 = ( 0 0 ),  E4 = ( 0 0 )
         ( 0 0 )        ( 0 0 )        ( 1 0 )        ( 0 1 )
Then
    T(E1) = ( a b )( 1 0 ) = ( a 0 ) = aE1 + 0E2 + cE3 + 0E4
            ( c d )( 0 0 )   ( c 0 )
    T(E2) = ( a b )( 0 1 ) = ( 0 a ) = 0E1 + aE2 + 0E3 + cE4
            ( c d )( 0 0 )   ( 0 c )
    T(E3) = ( a b )( 0 0 ) = ( b 0 ) = bE1 + 0E2 + dE3 + 0E4
            ( c d )( 1 0 )   ( d 0 )
    T(E4) = ( a b )( 0 0 ) = ( 0 b ) = 0E1 + bE2 + 0E3 + dE4
            ( c d )( 0 1 )   ( 0 d )
Thus
    [T]e  =  ( a  0  b  0 )
             ( 0  a  0  b )
             ( c  0  d  0 )
             ( 0  c  0  d )
and
    det(T)  =  a²d² + b²c² - 2abcd  =  (ad - bc)²



8.38. Find the inverse of A = ( 1  1  1 ).
                              ( 0  1  1 )
                              ( 0  0  1 )

The inverse of A is of the form (Problem 8.53):

    A⁻¹  =  ( 1  x  y )
            ( 0  1  z )
            ( 0  0  1 )

Set AA⁻¹ = I, the identity matrix:

    AA⁻¹  =  ( 1  1  1 )( 1  x  y )     ( 1  x+1  y+z+1 )     ( 1  0  0 )
             ( 0  1  1 )( 0  1  z )  =  ( 0   1    z+1  )  =  ( 0  1  0 )
             ( 0  0  1 )( 0  0  1 )     ( 0   0     1   )     ( 0  0  1 )

Set corresponding entries equal to each other to obtain the system

    x + 1 = 0,    y + z + 1 = 0,    z + 1 = 0

The solution of the system is x = -1, y = 0, z = -1. Hence

    A⁻¹  =  ( 1  -1   0 )
            ( 0   1  -1 )
            ( 0   0   1 )

A⁻¹ could also be found by the formula A⁻¹ = (adj A)/|A|.

8.39. Let D be a 2-linear, alternating function. Show that D{A, B) = - D{B, A). More 
generally, show that if D is multilinear and alternating, then 

D{...,A, ...,B, ...) = -D{...,B, ...,A, ...) 
that is, the sign is changed whenever two components are interchanged. 
Since D is alternating, D{A + B,A+B) = 0. Furthermore, since D is multilinear, 
= D{A+B,A+B) = D(A,A + B) + D(B,A + B) 

= D{A, A) + D(A, B) + D(B, A) + D{B, B) 
But D(A, A) = and I>(B, B) = 0. Hence 

= D(A, B) + D(B, A) or D{A, B) = - D(B, A) 

Similarly, = D(. . ., A + B, . . ., A+ B, . . .) 

= D(,...,A A,...) + D(....,A, ...,B, ...) 

+ D(...,B,...,A, ...) + D{...,B....,B, ...) 
= D(...,A, ...,B, ...) + D(...,B, ...,A, ...) 
and thus D(...,A B,...) = -D(...,B A,...). 

8.40. Let V be the vector space of 2 by 2 matrices M = ( , j over R. Determine 

whether or not D.V^R is 2-linear (with respect to the rows) if (i) D{M) = a + d, 
(ii) D{M) = ad. 

(i) No. For example, suppose A = (1, 1) and B - (3, 3). Then 

D(A,B) = d(\ 3) = 4 and D(2A.B) - C^^ 3) = 5 # 2D(A, B) 

(ii) Yes. Let A = (tti, a^, B = (61, 63) and C = (cj, Ca); then 

D(A,C) = dT' "^) = 01C2 and D(B,C) = ■d(^' ^') = V2 

Hence for any scalars s, < G R, 

D(sA + tB,C) = i?/«»i + **^ sa,+ tb,^^ ^ ^sa, + th,)c, 

= s(aiC2) + t(6iC2) = 8D(A, C) + tD{B, C) 

That is, D is linear with respect to the first row. 
Furthermore, 

D(C,A) = d(^; I'J = c,a, and D(C, B) = D^^^ ^ = c,b. 

Hence for any scalars s, t S R, 

D(C,sA + tB) ^ d( "J-., „?..)= Ci(sa2+t62) 

= s(cia2) + t(ci62) = sI>(C,A) + «D(C,B) 

That is, D is linear with respect to the second row. 
Both linearity conditions imply that D is 2-linear. 






Supplementary Problems 

COMPUTATION OF DETERMINANTS 



8.41. Evaluate the determinant of each matrix: (i) 



8.42. Compute the determinant of each matrix: (i) 



2 5 
4 1 

t-2 



(ii) 



6 1 
3 -2 



-3 \ .... /t-5 7 
4 t-l)' <"> -1 ( + 3 



8.43. For each matrix in the preceding problem, find those values of t for which the determinant is zero. 



8.44. Compute the determinant of each matrix: 

/2 1 l\ /3 -2 -4\ 

(i) 5 -2 , (ii) 2 5 -1 , (iii) 
\1 -3 4/ \0 6 1/ 

8.45. Evaluate the determinant of each matrix: 

(i) 



/-2 -1 4 






f - 1 3 -3 

(ii) I -3 t + 5 -3 

-6 6 t-4,1 




8.46. For each matrix in the preceding problem, determine those values of t for which the determinant 
is zero. 



8.47. Evaluate the determinant of each matrix: (i) 



/l 2 2 3 
10-20 



\4 -3 2) 



COF ACTORS, CLASSICAL ADJOINTS, INVERSES 

'l 2-2 3^ 

8.48. For the matrix I „ „ I, find the cof actor of: 

4 2 1 

,1 7 2 -3^ 
(i) the entry 4, (ii) the entry 5, (iii) the entry 7. 



8.49. Let A 




Find (i) adj A, (ii) A-i. 



2 1 3 2\ 

3 1-2 



3-1 1-21' ^"^ I 1 -1 4 3 



l2 2 -1 1/ 



'1 2 2' 
8.50. Let A = f 3 1 
a 1 1, 



Find (i) adj A, (ii) A-i. 



8.51. Find the classical adjoint of each matrix in Problem 8.47. 

8.52. Determine the general 2 by 2 matrix A for which A = adj A. 



8.53. Suppose A is diagonal and B is triangular; say, 

' di ... 
02 ... 



A = 



and B 



,0 



62 
lO 



K j 




(i) Show that adj A is diagonal and adj B is triangular. 

(ii) Show that B is invertible iff all 6; ¥= 0; hence A is invertible iff all «{ ^ 0. 

(iii) Show that the inverses of A and B (if either exists) are of the form 

t-i ... \ /&r* di2 ■•• dm 

62 ... d2n 



A-1 = |0 ar^ ••■ I 5_, ^ 



lO ... a- ' 



\o 



That is, the diagonal elements of A-i and B-^ are the inverses of the corresponding diagonal 
elements of A and B. 

DETERMINANT OF A LINEAR OPERATOR 

8.54. Let T be the linear operator on R^ defined by 

T{x, y, z) = (3a; - 2z, 5y + 7z,x + y + z) 
Find det (T). 

8.55. Let DiV^V be the differential operator, i.e. D{v) = dv/dt. Find det (D) if V is the space gen- 
erated by (i) {l,t, .... t"}, (ii) {e*, e^*, e^t}, (iii) {sin t, cos t}. 

8.56. Prove Theorem 8.12: Let T and S be linear operators on V. Then: 

(i) det (S°T) = det(S)'det(r); (ii) T is invertible if and only if det (7) 9^ 0. 

8.57. Show that: (i)det(lv) = l where ly is the identity operator; (ii) det (T-i) = det(r)-i if T is 
invertible. 

DETERMINANTS AND LINEAR EQUATIONS 

,.. (Sx + 5y = 8 ,.., f2x-Sy = -1 

8.58. Solve by determinants: (1) i , „ , (") i . , „ . • 

[ix — 2y = l I4x + 7y = —1 

(2x-5y + 2z = 7 (2z + 3 = y + Sx 

8.59. Solve by determinants: (i) } x + 2y - Az = 3 (ii) ■< x - Sz = 2y + 1. 

[sx-iy-Gz = 5 [sy + z = 2 - 2x 

8.60. Prove Theorem 8.10: The homogeneous system Ax = has a nonzero solution if and only if 

A = IA| == 0. 

PERMUTATIONS 

8.61. Determine the parity of these permutations in Sg: (i) a = 3 2 1 5 4, (ii) r = 1 3 5 2 4, (iii) y = 4 2 5 3 1. 

8.62. For the permutations <r, t and v in Problem 8.61, find (i) r°c, (ii) Tr°<r, (iii) cr-i, (iv) t-i. 

8.63. Let T e 5„. Show that t°<t runs through S„ as a runs through S„; that is, S„ = {t « a : a e iSf„}. 

8.64. Let aeS„ have the property that <t(w) = n. Let a* e S„_i be defined by <f*(x) = <r(x). (i) Show 
that sgn <r* = sgn a. (ii) Show that as a runs through S„, where cr{n) = n, c* runs through 
S„_i; that is, S„_i = {a* : <r e S„, <T(m) = n}. 

MULTILINEARITY 

8.65. Let V = (K"")"", i.e. V is the space of m-square matrices viewed as m-tuples of row vectors. Let 
D:V-^K. 

(i) Show that the following weaker statement is equivalent to D being alternating: 

i?(Ai,A2, ...,A„) = 

whenever Aj = Ai+i for some i. 
(ii) Suppose D is m-linear and alternating. Show that if A^.Az, ■■■,A^ are linearly dependent, 

then D(Ai,...,AJ = 0. 




8.66. Let V be the space of 2 by 2 matrices M = ( j over R. Determine whether or not 

D:V->^K is 2-linear (with respect to the rows) if (i) D{M) = ac-hd, (ii) D{M) = ah- ed, 
(iii) D{M) = 0, (iv) D{M) = 1. 

8.67. Let V be the space of M-square matrices over K. Suppose B B V is invertible and so det (B) ¥= 0. 
Define Z> : V -* X by Z?(A) = det (Afi)/det (B) where A G V. Hence 

fl(Ai, Ag, . . . , A„) = det (AiB, A^B, .... A„B)/det (B) 

where Aj is the tth row of A and so A^B is the ith row of AB. Show that Z> is multilinear and 
alternating, and that !>(/) = 1. (Thus by Theorem 8.13, D{A) - det (A) and so det (AB) = 
det (A) det (B). This method is used by some texts to prove Theorem 8.5, i.e. |Ai5| = \A\ |B|.) 

MISCELLANEOUS PROBLEMS 

8.68. Let A be an w-square matrix. Prove IfeAl = fc" \A\ . 



8.69. Prove:

    | 1  x1  x1²  ...  x1^(n-1) |
    | 1  x2  x2²  ...  x2^(n-1) |
    | .......................... |  =  the product of (xj - xi) over all pairs i < j
    | 1  xn  xn²  ...  xn^(n-1) |

The above is called the Vandermonde determinant of order n.

8.70. Consider the block matrix M = f j where A and C are square matrices. Prove \M\ = \A\ \C\. 

More generally, prove that if M is a triangular block matrix with square matrices Aj, . . ., A^ on 
the diagonal, then \M\ = |Ai| [Agl • • "l^ml- 

8.71. Let A, B, C and D be commuting m-square matrices. Consider the 2»-square block matrix 
^ " (c d)- P'^^^^tl'** W = \A\\D\-\B\\C\. 

8.72. Suppose A is orthogonal, that is, AᵗA = I. Show that |A| = ±1.

8.73. Consider a permutation σ = j1 j2 ... jn. Let {ei} be the usual basis of Kⁿ, and let A be the matrix
whose ith row is e_ji, i.e. A = (e_j1, e_j2, ..., e_jn). Show that |A| = sgn σ.

8.74. Let A be an M-square matrix. The determinantal rank of A is the order of the largest submatrix 
of A (obtained by deleting rows and columns of A) whose determinant is not zero. Show that the 
determinantal rank of A is equal to its rank, i.e. the maximum number of linearly independent rows 
(or columns). 



Answers to Supplementary Problems 

8.41. (i) -18, (ii) -15. 

8.42. (i) t2 - 3t - 10, (ii) t^ - 2t - 8. 

8.43. (i) t = 5, t- -2; (ii) t - i, t - -2. 

8.44. (i) 21, (ii) -11, (iii) 100, (iv) 0. 




8.45. (i) (< + 2)(t-3)(t-4), (li) (t + 2)2(t-4), (iii) (t + 2)2(f-4). 

8.46. (i) 3, 4, -2; (ii) 4, -2; (iii) 4, -2. 

8.47. (i) -131, (ii) -55. 

8.48. (i) -135, (ii) -103, (iii) -31. 

8.49. adj A = -1 1 -1 , A-i = (adj A)/|A1 = i -^ | 
\ 2 -2 0/ \-l 1 0, 

1 -2\ /-I 2\ 

8.50. adj A = | -3 -1 6 , A-i =31 

2 1 -5/ \-2 -1 5/ 

(-16 -29 -26 -2\ / 21 -14 -17 -19^ 

-30 -38 -16 29 1 ,... I -44 H 33 11 

-8 51 -13 -1 (") -29 1 13 21 

-13 1 28 -18/ \ 17 7 -19 -18; 

/k 
8.52. A = ( „ , 

8.54. det(r) = 4. 

8.55. (i) 0, (ii) 6, (iii) 1. 

8.58. (i) X = 21/26, y = 29/26; (ii) x = -5/13, y = 1/13. 

8.59. (i) X = 5, y — 1, z = 1. (ii) Since A = 0, the system cannot be solved by determinants. 

8.61. sgn σ = 1, sgn τ = -1, sgn π = -1.

8.62. (i) T°v = 53142, (ii) ir°(r = 52413, (iii) <r-i = 32154, (iv) t-i = 14253. 
8.66. (i) Yes, (ii) No, (iii) Yes, (iv) No. 



chapter 9 



Eigenvalues and Eigenvectors 

INTRODUCTION 

In this chapter we investigate the theory of a single linear operator T on a vector 
space V of finite dimension. In particular, we find conditions under which T is diago- 
nalizable. As was seen in Chapter 7, this question is closely related to the theory of 
similarity transformations for matrices. 

We shall also associate certain polynomials with an operator T: its characteristic 
polynomial and its minimum polynomial. These polynomials and their roots play a major 
role in the investigation of T. We comment that the particular field K also plays an im- 
portant part in the theory since the existence of roots of a polynomial depends on K. 

POLYNOMIALS OF MATRICES AND LINEAR OPERATORS 

Consider a polynomial f(t) over a field K: f(t) = an tⁿ + ··· + a1 t + a0. If A is a square
matrix over K, then we define

    f(A) = an Aⁿ + ··· + a1 A + a0 I

where I is the identity matrix. In particular, we say that A is a root or zero of the poly-
nomial f(t) if f(A) = 0.

Example 9.1: Let A = ( 1  2 ) and let f(t) = 2t² - 3t + 7, g(t) = t² - 5t - 2. Then
                     ( 3  4 )

    f(A) = 2( 1  2 )² - 3( 1  2 ) + 7( 1  0 ) = ( 18  14 )
            ( 3  4 )     ( 3  4 )    ( 0  1 )   ( 21  39 )
and
    g(A) = ( 1  2 )² - 5( 1  2 ) - 2( 1  0 ) = ( 0  0 )
           ( 3  4 )     ( 3  4 )    ( 0  1 )   ( 0  0 )

Thus A is a zero of g(t).
The following theorem applies. 

Theorem 9.1: Let f and g be polynomials over K, and let A be an n-square matrix over K.
Then

    (i) (f + g)(A) = f(A) + g(A)

    (ii) (fg)(A) = f(A) g(A)

and, for any scalar k ∈ K,

    (iii) (kf)(A) = k f(A)

Furthermore, since f(t) g(t) = g(t) f(t) for any polynomials f(t) and g(t),

    f(A) g(A) = g(A) f(A)

That is, any two polynomials in the matrix A commute.

197 






Now suppose T : V → V is a linear operator on a vector space V over K. If f(t) =
an tⁿ + ··· + a1 t + a0, then we define f(T) in the same way as we did for matrices:

    f(T) = an Tⁿ + ··· + a1 T + a0 I
where / is now the identity mapping. We also say that T is a zero or root of f(t) if f{T) = 0. 
We remark that the relations in Theorem 9.1 hold for operators as they do for matrices; 
hence any two polynomials in T commute. 

Furthermore, if A is a matrix representation of T, then /(A) is the matrix representation 
of f(T). In particular, f{T) = if and only if /(A) = 0. 

EIGENVALUES AND EIGENVECTORS 

Let T : V → V be a linear operator on a vector space V over a field K. A scalar λ ∈ K
is called an eigenvalue of T if there exists a nonzero vector v ∈ V for which

    T(v) = λv

Every vector satisfying this relation is then called an eigenvector of T belonging to the
eigenvalue λ. Note that each scalar multiple kv is such an eigenvector:

    T(kv) = k T(v) = k(λv) = λ(kv)

The set of all such vectors is a subspace of V (Problem 9.6) called the eigenspace of λ.

The terms characteristic value and characteristic vector (or: proper value and proper 
vector) are frequently used instead of eigenvalue and eigenvector. 

Example 9.2: Let I.V^V be the identity mapping. Then, for every vGV, I{v) = v = Iv. 
Hence 1 is an eigenvalue of /, and every vector in V is an eigenvector belonging to 1. 



Example 9.3: Let T : R² → R² be the linear operator which rotates each vector v ∈ R² by
an angle θ = 90°. Note that no nonzero vector is a multiple of itself. Hence T
has no eigenvalues and so no eigenvectors.

Example 9.4: Let D be the differential operator on the vector space V of differentiable
functions. We have D(e^(5t)) = 5e^(5t). Hence 5 is an eigenvalue of D with
eigenvector e^(5t).

If A is an n-square matrix over K, then an eigenvalue of A means an eigenvalue of A
viewed as an operator on Kⁿ. That is, λ ∈ K is an eigenvalue of A if, for some nonzero
(column) vector v ∈ Kⁿ,

    Av = λv

In this case v is an eigenvector of A belonging to λ.



Example 9.5: Find eigenvalues and associated nonzero eigenvectors of the matrix A = ( 1  2 ).
                                                                                    ( 3  2 )
We seek a scalar t and a nonzero vector X = ( x ) such that AX = tX:
                                            ( y )
    ( 1  2 )( x )  =  t( x )
    ( 3  2 )( y )      ( y )

The above matrix equation is equivalent to the homogeneous system

     x + 2y = tx          (t-1)x -    2y  = 0
    3x + 2y = ty    or    -3x   + (t-2)y  = 0          (1)

Recall that the homogeneous system has a nonzero solution if and only if the de-
terminant of the matrix of coefficients is 0:

    | t-1  -2  |
    | -3   t-2 |  =  t² - 3t - 4  =  (t - 4)(t + 1)  =  0

Thus t is an eigenvalue of A if and only if t = 4 or t = -1.

Setting t = 4 in (1),

    3x - 2y = 0
   -3x + 2y = 0      or simply   3x - 2y = 0

Thus v = ( x ) = ( 2 ) is a nonzero eigenvector belonging to the eigenvalue t = 4,
         ( y )   ( 3 )
and every other eigenvector belonging to t = 4 is a multiple of v.

Setting t = -1 in (1),

   -2x - 2y = 0
   -3x - 3y = 0      or simply   x + y = 0

Thus w = ( 1 ) is a nonzero eigenvector belonging to the eigenvalue t = -1, and
         (-1 )
every other eigenvector belonging to t = -1 is a multiple of w.
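The same eigenvalues and eigenvectors can be checked numerically. A sketch (NumPy assumed; note that np.linalg.eig returns eigenvectors of unit length, so they are scalar multiples of the ones found above):

    import numpy as np

    A = np.array([[1, 2], [3, 2]])
    vals, vecs = np.linalg.eig(A)
    print(vals)                    # 4 and -1 (the order is not guaranteed)
    for k in range(2):
        v = vecs[:, k]
        print(vals[k], v / v[0])   # rescaled eigenvector: proportional to (2, 3) or (1, -1)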

The next theorem gives an important characterization of eigenvalues which is fre- 
quently used as its definition. 

Theorem 9.2: Let T : V → V be a linear operator on a vector space over K. Then λ ∈ K
is an eigenvalue of T if and only if the operator λI - T is singular. The
eigenspace of λ is then the kernel of λI - T.

Proof. λ is an eigenvalue of T if and only if there exists a nonzero vector v such that

    T(v) = λv    or    (λI)(v) - T(v) = 0    or    (λI - T)(v) = 0

i.e. λI - T is singular. We also have that v is in the eigenspace of λ if and only if the above
relations hold; hence v is in the kernel of λI - T.

We now state a very useful theorem which we prove (Problem 9.14) by induction:

Theorem 9.3: Nonzero eigenvectors belonging to distinct eigenvalues are linearly 
independent. 

Example 9.6: Consider the functions e^(a1 t), e^(a2 t), ..., e^(an t) where a1, ..., an are distinct real numbers.
If D is the differential operator then D(e^(ai t)) = ai e^(ai t). Accordingly, e^(a1 t), ..., e^(an t)
are eigenvectors of D belonging to the distinct eigenvalues a1, ..., an, and so, by
Theorem 9.3, are linearly independent.

We remark that independent eigenvectors can belong to the same eigenvalue (see 
Problem 9.7). 

DIAGONALIZATION AND EIGENVECTORS 

Let T : V → V be a linear operator on a vector space V with finite dimension n. Note
that T can be represented by a diagonal matrix

    ( k1  0   ...  0  )
    ( 0   k2  ...  0  )
    ( ................ )
    ( 0   0   ...  kn )

if and only if there exists a basis {v1, ..., vn} of V for which

    T(v1) = k1 v1
    T(v2) = k2 v2
    .............
    T(vn) = kn vn

that is, such that the vectors v1, ..., vn are eigenvectors of T belonging respectively to eigen-
values k1, ..., kn. In other words:

Theorem 9.4: A linear operator T : V -» V can be represented by a diagonal matrix B 
if and only if V has a basis consisting of eigenvectors of T. In this case 
the diagonal elements of B are the corresponding eigenvalues. 

We have the following equivalent statement. 

Alternate Form of Theorem 9.4: An n-square matrix A is similar to a diagonal matrix

B if and only if A has n linearly independent eigen- 
vectors. In this case the diagonal elements of B are the 
corresponding eigenvalues. 

In the above theorem, if we let P be the matrix whose columns are the n independent 
eigenvectors of A, then B = P~^AP. 

Example 9.7: Consider the matrix A = ( 1  2 ). By Example 9.5, A has two independent
                                     ( 3  2 )
eigenvectors ( 2 ) and (  1 ). Set P = ( 2   1 ), and so P⁻¹ = ( 1/5   1/5 ).
             ( 3 )     ( -1 )          ( 3  -1 )               ( 3/5  -2/5 )

Then A is similar to the diagonal matrix

    B = P⁻¹AP = ( 1/5   1/5 )( 1  2 )( 2   1 )  =  ( 4   0 )
                ( 3/5  -2/5 )( 3  2 )( 3  -1 )     ( 0  -1 )



As expected, the diagonal elements 4 and —1 of the diagonal matrix B are the eigen- 
values corresponding to the given eigenvectors. 
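The similarity B = P⁻¹AP is easy to verify directly. A short sketch (NumPy assumed):

    import numpy as np

    A = np.array([[1, 2], [3, 2]])
    P = np.array([[2, 1], [3, -1]])        # columns are the eigenvectors of Example 9.5
    B = np.linalg.inv(P) @ A @ P
    print(np.round(B))                     # diag(4, -1)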



CHARACTERISTIC POLYNOMIAL, CAYLEY-HAMILTON THEOREM 

Consider an n-square matrix A over a field K:

    A  =  ( a11  a12  ...  a1n )
          ( a21  a22  ...  a2n )
          ( .................. )
          ( an1  an2  ...  ann )

The matrix tIn - A, where In is the n-square identity matrix and t is an indeterminate, is
called the characteristic matrix of A:

    tIn - A  =  ( t-a11   -a12   ...   -a1n )
                ( -a21   t-a22   ...   -a2n )
                ( .......................... )
                ( -an1    -an2   ...  t-ann )

Its determinant

    ΔA(t) = det(tIn - A)

which is a polynomial in t, is called the characteristic polynomial of A. We also call

    ΔA(t) = det(tIn - A) = 0

the characteristic equation of A.




Now each term in the determinant contains one and only one entry from each row and
from each column; hence the above characteristic polynomial is of the form

    ΔA(t) = (t - a11)(t - a22) ··· (t - ann)
            + terms with at most n - 2 factors of the form t - aii

Accordingly,

    ΔA(t) = tⁿ - (a11 + a22 + ··· + ann)tⁿ⁻¹ + terms of lower degree

Recall that the trace of A is the sum of its diagonal elements. Thus the characteristic
polynomial ΔA(t) = det(tIn - A) of A is a monic polynomial of degree n, and the coefficient
of tⁿ⁻¹ is the negative of the trace of A. (A polynomial is monic if its leading coefficient is 1.)

Furthermore, if we set t = 0 in ΔA(t), we obtain

    ΔA(0) = |-A| = (-1)ⁿ|A|

But ΔA(0) is the constant term of the polynomial ΔA(t). Thus the constant term of the char-
acteristic polynomial of the matrix A is (-1)ⁿ|A| where n is the order of A.

Example 9.8: The characteristic polynomial of the matrix A = (  1  3   0 ) is
                                                             ( -2  2  -1 )
                                                             (  4  0  -2 )

    Δ(t) = |tI - A| = | t-1  -3    0  |
                      |  2   t-2   1  |
                      | -4    0   t+2 |  =  t³ - t² + 2t + 28

As expected, Δ(t) is a monic polynomial of degree 3.
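The two facts about the coefficients can be checked numerically on this example. A small sketch (NumPy assumed; np.poly returns the coefficients of det(tI − A), highest power first):

    import numpy as np

    A = np.array([[1, 3, 0], [-2, 2, -1], [4, 0, -2]])
    coeffs = np.poly(A)
    print(np.round(coeffs, 6))                          # [1, -1, 2, 28] -> t^3 - t^2 + 2t + 28
    print(-coeffs[1], np.trace(A))                      # coefficient of t^(n-1) is -trace(A)
    print(coeffs[-1], (-1) ** 3 * np.linalg.det(A))     # constant term is (-1)^n |A|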

We now state one of the most important theorems in linear algebra. 
Cayley-Hamilton Theorem 9.5: Every matrix is a zero of its characteristic polynomial. 

Example 9.9: The characteristic polynomial of the matrix A = ( 1  2 ) is
                                                             ( 3  2 )

    Δ(t) = |tI - A| = | t-1  -2  |
                      | -3   t-2 |  =  t² - 3t - 4

As expected from the Cayley-Hamilton theorem, A is a zero of Δ(t):

    Δ(A) = A² - 3A - 4I = ( 7   6 ) - ( 3  6 ) - ( 4  0 ) = ( 0  0 )
                          ( 9  10 )   ( 9  6 )   ( 0  4 )   ( 0  0 )
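The verification is a one-line computation. A sketch (NumPy assumed):

    import numpy as np

    A = np.array([[1, 2], [3, 2]])
    print(A @ A - 3 * A - 4 * np.eye(2))    # the zero matrix, as Cayley-Hamilton asserts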

The next theorem shows the intimate relationship between characteristic polynomials 
and eigenvalues. 

Theorem 9.6: Let A be an n-square matrix over a field K. A scalar λ ∈ K is an eigen-
value of A if and only if λ is a root of the characteristic polynomial Δ(t) of A.

Proof. By Theorem 9.2, λ is an eigenvalue of A if and only if λI - A is singular.
Furthermore, by Theorem 8.4, λI - A is singular if and only if |λI - A| = 0, i.e. λ is a root
of Δ(t). Thus the theorem is proved.

Using Theorems 9.3, 9.4 and 9.6, we obtain 

Corollary 9.7: If the characteristic polynomial Δ(t) of an n-square matrix A is a product
of distinct linear factors:

    Δ(t) = (t - a1)(t - a2) ··· (t - an)

i.e. if a1, ..., an are distinct roots of Δ(t), then A is similar to a diagonal
matrix whose diagonal elements are the ai.

Furthermore, using the Fundamental Theorem of Algebra (every polynomial over C 
has a root) and the above theorem, we obtain 

Corollary 9.8: Let A be an n-square matrix over the complex field C. Then A has at
least one eigenvalue.




Example 9.10: Let A = ( 3  0   0 ). Its characteristic polynomial is
                      ( 0  2  -5 )
                      ( 0  1  -2 )

    Δ(t) = | t-3   0    0  |
           |  0   t-2   5  |
           |  0   -1   t+2 |  =  (t - 3)(t² + 1)

We consider two cases:



(i) A is a matrix over the real field R. Then A has only the one eigenvalue 3. 
Since 3 has only one independent eigenvector, A is not diagonalizable. 

(ii) A is a matrix over the complex field C. Then A has three distinct eigenvalues: 
3, i and —i. Thus there exists an invertible matrix P over the complex field C 
for which 

    P⁻¹AP = ( 3  0   0 )
            ( 0  i   0 )
            ( 0  0  -i )
i.e. A is diagonalizable. 

Now suppose A and B are similar matrices, say B = P⁻¹AP where P is invertible. We
show that A and B have the same characteristic polynomial. Using tI = P⁻¹(tI)P,

    |tI - B| = |tI - P⁻¹AP| = |P⁻¹(tI)P - P⁻¹AP|
             = |P⁻¹(tI - A)P| = |P⁻¹| |tI - A| |P|

Since determinants are scalars and commute, and since |P⁻¹| |P| = 1, we finally obtain

    |tI - B| = |tI - A|

Thus we have proved

Theorem 9.9: Similar matrices have the same characteristic polynomial.

MINIMUM POLYNOMIAL 

Let A be an n-square matrix over a field K. Observe that there are nonzero polynomials
f(t) for which f(A) = 0; for example, the characteristic polynomial of A. Among these
polynomials we consider those of lowest degree and from them we select one whose leading
coefficient is 1, i.e. which is monic. Such a polynomial m(t) exists and is unique (Problem
9.25); we call it the minimum polynomial of A.

Theorem 9.10: The minimum polynomial m(t) of A divides every polynomial which has A
as a zero. In particular, m(t) divides the characteristic polynomial Δ(t) of A.

There is an even stronger relationship between m(t) and Δ(t).







Theorem 9.11: The characteristic and minimum polynomials of a matrix A have the same 
irreducible factors. 

This theorem does not say that m(t) = Δ(t); only that any irreducible factor of one must
divide the other. In particular, since a linear factor is irreducible, m(t) and Δ(t) have the
same linear factors; hence they have the same roots. Thus from Theorem 9.6 we obtain

Theorem 9.12: A scalar λ is an eigenvalue for a matrix A if and only if λ is a root of the
minimum polynomial of A.



Example 9.11: Find the minimum polynomial m(t) of the matrix A = ( 2  1  0  0 ).
                                                                 ( 0  2  0  0 )
                                                                 ( 0  0  2  0 )
                                                                 ( 0  0  0  5 )

The characteristic polynomial of A is Δ(t) = |tI - A| = (t - 2)³(t - 5). By
Theorem 9.11, both t - 2 and t - 5 must be factors of m(t). But by Theorem 9.10,
m(t) must divide Δ(t); hence m(t) must be one of the following three polynomials:

    m1(t) = (t - 2)(t - 5),    m2(t) = (t - 2)²(t - 5),    m3(t) = (t - 2)³(t - 5)

We know from the Cayley-Hamilton theorem that m3(A) = Δ(A) = 0. The reader can
verify that m1(A) ≠ 0 but m2(A) = 0. Accordingly, m2(t) = (t - 2)²(t - 5) is the
minimum polynomial of A.
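The verification that m1(A) ≠ 0 but m2(A) = 0 can be carried out numerically. A sketch (NumPy assumed, using the matrix A above):

    import numpy as np

    A = np.array([[2, 1, 0, 0],
                  [0, 2, 0, 0],
                  [0, 0, 2, 0],
                  [0, 0, 0, 5]])
    I = np.eye(4)
    m1 = (A - 2 * I) @ (A - 5 * I)                    # (t-2)(t-5) evaluated at A
    m2 = (A - 2 * I) @ (A - 2 * I) @ (A - 5 * I)      # (t-2)^2 (t-5) evaluated at A
    print(np.any(m1))    # True:  m1(A) != 0
    print(np.any(m2))    # False: m2(A) == 0, so m(t) = (t-2)^2 (t-5)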

Example 9.12: Let A be a 3 by 3 matrix over the real field R. We show that A cannot be a zero
of the polynomial f(t) = t² + 1. By the Cayley-Hamilton theorem, A is a zero of
its characteristic polynomial Δ(t). Note that Δ(t) is of degree 3; hence it has at least
one real root.

Now suppose A is a zero of f(t). Since f(t) is irreducible over R, f(t) must be
the minimal polynomial of A. But f(t) has no real root. This contradicts the fact
that the characteristic and minimal polynomials have the same roots. Thus A is not
a zero of f(t).

The reader can verify that the following 3 by 3 matrix over the complex
field C is a zero of f(t):

    ( 0  -1  0 )
    ( 1   0  0 )
    ( 0   0  i )



CHARACTERISTIC AND MINIMUM POLYNOMIALS OF 
LINEAR OPERATORS 

Now suppose T : V → V is a linear operator on a vector space V with finite dimension.
We define the characteristic polynomial Δ(t) of T to be the characteristic polynomial of any
matrix representation of T. By Theorem 9.9, Δ(t) is independent of the particular basis in
which the matrix representation is computed. Note that the degree of Δ(t) is equal to the
dimension of V. We have theorems for T which are similar to the ones we had for matrices:

Theorem 9.5': T is a zero of its characteristic polynomial.

Theorem 9.6': The scalar λ ∈ K is an eigenvalue of T if and only if λ is a root of the
characteristic polynomial of T.

The algebraic multiplicity of an eigenvalue λ ∈ K of T is defined to be the multiplicity
of λ as a root of the characteristic polynomial of T. The geometric multiplicity of the
eigenvalue λ is defined to be the dimension of its eigenspace.

Theorem 9.13: The geometric multiplicity of an eigenvalue λ does not exceed its algebraic
multiplicity.




Example 9.13: Let V be the vector space of functions which has {sin θ, cos θ} as a basis, and let D
be the differential operator on V. Then

    D(sin θ) = cos θ  =  0(sin θ) + 1(cos θ)
    D(cos θ) = -sin θ = -1(sin θ) + 0(cos θ)

The matrix A of D in the above basis is therefore A = [D] = ( 0  -1 ). Thus
                                                            ( 1   0 )

    det(tI - A) = |  t  1 |
                  | -1  t |  =  t² + 1

and the characteristic polynomial of D is Δ(t) = t² + 1.

On the other hand, the minimum polynomial m{t) of the operator T is defined independ- 
ently of the theory of matrices, as the polynomial of lowest degree and leading coefficient 1 
which has T as a zero. However, for any polynomial f{t), 

f(T) = 0 if and only if f(A) = 0

where A is any matrix representation of T. Accordingly, T and A have the same minimum 
polynomial. We remark that all the theorems in this chapter on the minimum polynomial 
of a matrix also hold for the minimum polynomial of the operator T. 



Solved Problems 

POLYNOMIALS OF MATRICES AND LINEAR OPERATORS 
9.1. Find f{A) where A = T ~z\ and f{t) ^ f-M + 1. 



9.2. Show that ^ = ( « o ) ^^ ^ ^^^^ ^^ f(^) = *^ - 4t - 5. 

'«-' = --"— a;)'-G :)-<;:) = a: 



9.3. Let V be the vector space of functions which has {sin θ, cos θ} as a basis, and let
D be the differential operator on V. Show that D is a zero of f(t) = t² + 1.

Apply f(D) to each basis vector:

    f(D)(sin θ) = (D² + I)(sin θ) = D²(sin θ) + I(sin θ) = -sin θ + sin θ = 0
    f(D)(cos θ) = (D² + I)(cos θ) = D²(cos θ) + I(cos θ) = -cos θ + cos θ = 0

Since each basis vector is mapped into 0, every vector v ∈ V is also mapped into 0 by f(D). Thus
f(D) = 0.

This result is expected since, by Example 9.13, f(t) is the characteristic polynomial of D.




9.4. Let A be a matrix representation of an operator T. Show that /(A) is the matrix 
representation of f{T), for any polynomial f{t). 

Let <t> be the mapping T h* A, i.e. which sends the operator T into its matrix representation A. 
We need to prove that <i>(f(T)) - f(A). Suppose fit) = a„t" -\- • ■ ■ + a^t + a^. The proof is by in- 
duction on n, the degree of fit). 

Suppose TO = 0. Recall that 0(/') = / where /' is the identity mapping and / is the identity 
matrix. Thus 

<t>{f(T)) = <t>(Hr) = ao0(/') = V = /(A) 

and so the theorem holds for n = 0. 

Now assume the theorem holds for polynomials of degree less than n. Then since ^ is an 
algebra isomorphism, 

^if(T)) = 0(tt„r" + a„_irn-i + • • • + aiT + a^') 

= a„0(r) 0(r»-i) + 0(a„_ir"-i + • • • + aiT + a^') 

= o„AA«-i + (a„_jA"-i + • • • + aiA + a<,/) = /(A) 
and the theorem is proved. 



9.5. Prove Theorem 9.1: Let / and g be polynomials over K. Let A be a square matrix 
over K. Then: (i) (/ + fir)(A) = /(A) + flr(A); (ii) {fg){A) = /(A) g{A); and (iii) (fe/)(A) = 
kf(A) where fc G K. 

Suppose / = a„t" + • • • + Oi* + Oq and g = b^t^ + • • • + bit + bf,. Then by definition, 

f(A) = a„A» + • • • + ttiA + Oq/ and ^(A) = ft^A" + • ■ • + biA + bol 

(i) Suppose m — n and let ftj = if i> m. Then 

/ + sr = (a„+6„)t"+ ••• +K + 6i)t + (ao+6o) 

Hence 

(/ + g)(A) = {a„ + 6„)A" +•••+(«! + 6i)A + (a<, + 6o)^ 

= a„A» + 6„A« + • • • + ajA + 6iA + tto^ + 60/ = /(A) + flr(A) 

n + m 

(ii) By definition, fg = c„ +„ t" +>"+•••+ Ci* + Co = 2 c^t'' where c^ = eio^fc + "'i^fc-i + 

fc = 

fc n + m 



+ Ojcfro = 2 aj6fc_i. Hence (/ff)(A) = 2 CfcA^ and 



1=0 



/n N/™ \ nm n + m 

/(A)f,(A) = ( 2 OiAMf 2 6jAi) = 2 2»M*+^ = 2 CfcA" = (/ff)(A) 
\i=o /\j=o / t=0 j=0 fc=0 

(iii) By definition, kf = A;a„t" + • • • + fcajt + fcoo. and so 

(fe/)(A) = fca„A» + • • • + fettiA + fcoo/ = A;(a„A» + • • • + ajA + o,/) = k /(A) 



EIGENVALUES AND EIGENVECTORS

9.6. Let λ be an eigenvalue of an operator T : V → V. Let Vλ denote the set of all eigen-
vectors of T belonging to the eigenvalue λ (called the eigenspace of λ). Show that
Vλ is a subspace of V.

Suppose v, w ∈ Vλ, that is, T(v) = λv and T(w) = λw. Then for any scalars a, b ∈ K,

    T(av + bw) = a T(v) + b T(w) = a(λv) + b(λw) = λ(av + bw)

Thus av + bw is an eigenvector belonging to λ, i.e. av + bw ∈ Vλ. Hence Vλ is a subspace of V.






9.7. Let A = ( 1 4; 2 3 ). (i) Find all eigenvalues of A and the corresponding eigenvectors.
(ii) Find an invertible matrix P such that P⁻¹AP is diagonal.

(i) Form the characteristic matrix tI − A of A:

    tI − A = ( t 0; 0 t ) − ( 1 4; 2 3 ) = ( t−1 −4; −2 t−3 )        (1)

The characteristic polynomial Δ(t) of A is its determinant:

    Δ(t) = |tI − A| = | t−1 −4; −2 t−3 | = t² − 4t − 5 = (t − 5)(t + 1)

The roots of Δ(t) are 5 and −1, and so these numbers are the eigenvalues of A.

We obtain the eigenvectors belonging to the eigenvalue 5. First substitute t = 5 into
the characteristic matrix (1) to obtain the matrix ( 4 −4; −2 2 ). The eigenvectors belonging to
5 form the solution of the homogeneous system determined by the above matrix, i.e.,

    ( 4 −4; −2 2 )( x; y ) = ( 0; 0 )   or   { 4x − 4y = 0, −2x + 2y = 0 }   or   x − y = 0

(In other words, the eigenvectors belonging to 5 form the kernel of the operator tI − A for
t = 5.) The above system has only one independent solution; for example, x = 1, y = 1. Thus
v = (1, 1) is an eigenvector which generates the eigenspace of 5, i.e. every eigenvector belong-
ing to 5 is a multiple of v.

We obtain the eigenvectors belonging to the eigenvalue −1. Substitute t = −1 into (1)
to obtain the homogeneous system

    ( −2 −4; −2 −4 )( x; y ) = ( 0; 0 )   or   { −2x − 4y = 0, −2x − 4y = 0 }   or   x + 2y = 0

The system has only one independent solution; for example, x = 2, y = −1. Thus w = (2, −1)
is an eigenvector which generates the eigenspace of −1.

(ii) Let P be the matrix whose columns are the above eigenvectors: P = ( 1 2; 1 −1 ). Then
B = P⁻¹AP is the diagonal matrix whose diagonal entries are the respective eigenvalues:

    B = P⁻¹AP = ( 1/3 2/3; 1/3 −1/3 )( 1 4; 2 3 )( 1 2; 1 −1 ) = ( 5 0; 0 −1 )

(Remark. Here P is the transition matrix from the usual basis of R² to the basis of eigen-
vectors {v, w}. Hence B is the matrix representation of the operator A in this new basis.)
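The diagonalization above is easy to verify numerically; the following NumPy sketch uses the matrix A and the eigenvector matrix P obtained in this problem.

    import numpy as np

    A = np.array([[1.0, 4.0], [2.0, 3.0]])
    P = np.array([[1.0, 2.0], [1.0, -1.0]])     # columns are the eigenvectors v = (1, 1), w = (2, -1)

    B = np.linalg.inv(P) @ A @ P
    print(np.round(B, 10))                      # diag(5, -1), matching the eigenvalues found above

    vals, vecs = np.linalg.eig(A)               # NumPy's own eigen-decomposition
    print(vals)                                 # eigenvalues 5 and -1 (ordering may differ)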



9.8. For each matrix, find all eigenvalues and a basis of each eigenspace:

    (i) A = ( 1 −3 3; 3 −5 3; 6 −6 4 ),   (ii) B = ( −3 1 −1; −7 5 −1; −6 6 −2 )

Which matrix can be diagonalized, and why?

(i) Form the characteristic matrix tI − A and compute its determinant to obtain the character-
istic polynomial Δ(t) of A:

    Δ(t) = |tI − A| = | t−1 3 −3; −3 t+5 −3; −6 6 t−4 | = (t + 2)²(t − 4)

The roots of Δ(t) are −2 and 4; hence these numbers are the eigenvalues of A.





We find a basis of the eigenspace of the eigenvalue −2. Substitute t = −2 into the char-
acteristic matrix tI − A to obtain the homogeneous system

    { −3x + 3y − 3z = 0, −3x + 3y − 3z = 0, −6x + 6y − 6z = 0 }   or   x − y + z = 0

The system has two independent solutions, e.g. x = 1, y = 1, z = 0 and x = 1, y = 0, z = −1.
Thus u = (1, 1, 0) and v = (1, 0, −1) are independent eigenvectors which generate the eigen-
space of −2. That is, u and v form a basis of the eigenspace of −2. This means that every
eigenvector belonging to −2 is a linear combination of u and v.

We find a basis of the eigenspace of the eigenvalue 4. Substitute t = 4 into the char-
acteristic matrix tI − A to obtain the homogeneous system

    { 3x + 3y − 3z = 0, −3x + 9y − 3z = 0, −6x + 6y = 0 }   or   { x + y − z = 0, 2y − z = 0 }

The system has only one free variable; hence any particular nonzero solution, e.g. x = 1, y = 1,
z = 2, generates its solution space. Thus w = (1, 1, 2) is an eigenvector which generates, and
so forms a basis of, the eigenspace of 4.

Since A has three linearly independent eigenvectors, A is diagonalizable. In fact, let P
be the matrix whose columns are the three independent eigenvectors:

    P = ( 1 1 1; 1 0 1; 0 −1 2 )        Then   P⁻¹AP = ( −2 0 0; 0 −2 0; 0 0 4 )

As expected, the diagonal elements of P⁻¹AP are the eigenvalues of A corresponding to the
columns of P.



(ii) Δ(t) = |tI − B| = | t+3 −1 1; 7 t−5 1; 6 −6 t+2 | = (t + 2)²(t − 4)

The eigenvalues of B are therefore −2 and 4.



We find a basis of the eigenspace of the eigenvalue −2. Substitute t = −2 into tI − B
to obtain the homogeneous system

    { x − y + z = 0, 7x − 7y + z = 0, 6x − 6y = 0 }   or   { x − y + z = 0, z = 0 }

The system has only one independent solution, e.g. x = 1, y = 1, z = 0. Thus u = (1, 1, 0)
forms a basis of the eigenspace of −2.

We find a basis of the eigenspace of the eigenvalue 4. Substitute t = 4 into tI − B to
obtain the homogeneous system

    { 7x − y + z = 0, 7x − y + z = 0, 6x − 6y + 6z = 0 }   or   { 7x − y + z = 0, x = 0 }

The system has only one independent solution, e.g. x = 0, y = 1, z = 1. Thus v = (0, 1, 1)
forms a basis of the eigenspace of 4.

Observe that B is not similar to a diagonal matrix since B has only two independent
eigenvectors. Furthermore, since A can be diagonalized but B cannot, A and B are not similar
matrices, even though they have the same characteristic polynomial.
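A short SymPy sketch confirms these conclusions for the two matrices of this problem: both have characteristic polynomial (t + 2)²(t − 4), but only A has a 2-dimensional eigenspace for −2.

    from sympy import Matrix, Symbol, factor

    t = Symbol('t')
    A = Matrix([[1, -3, 3], [3, -5, 3], [6, -6, 4]])
    B = Matrix([[-3, 1, -1], [-7, 5, -1], [-6, 6, -2]])

    print(factor(A.charpoly(t).as_expr()), factor(B.charpoly(t).as_expr()))   # same polynomial

    # eigenvects() returns (eigenvalue, algebraic multiplicity, basis of the eigenspace)
    for val, mult, basis in A.eigenvects():
        print(val, mult, len(basis))       # -2 has a 2-dimensional eigenspace; 4 a 1-dimensional one
    for val, mult, basis in B.eigenvects():
        print(val, mult, len(basis))       # -2 has only a 1-dimensional eigenspace

    print(A.is_diagonalizable(), B.is_diagonalizable())    # True False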




9.9. Let A = ( 3 −1; 1 1 ) and B = ( 1 −1; 2 −1 ). Find all eigenvalues and the corresponding
eigenvectors of A and B viewed as matrices over (i) the real field R, (ii) the complex
field C.

(i) Δ_A(t) = |tI − A| = | t−3 1; −1 t−1 | = t² − 4t + 4 = (t − 2)²

Hence only 2 is an eigenvalue. Put t = 2 into tI − A and obtain the homogeneous system

    { −x + y = 0, −x + y = 0 }   or   x − y = 0

The system has only one independent solution, e.g. x = 1, y = 1. Thus v = (1, 1) is an eigen-
vector which generates the eigenspace of 2, i.e. every eigenvector belonging to 2 is a multiple
of v.

We also have

    Δ_B(t) = |tI − B| = | t−1 1; −2 t+1 | = t² + 1

Since t² + 1 has no root in R, B has no eigenvalue as a matrix over R.

(ii) Since Δ_A(t) = (t − 2)² has only the real root 2, the results are the same as in (i). That is,
2 is an eigenvalue of A, and v = (1, 1) is an eigenvector which generates the eigenspace of 2,
i.e. every eigenvector belonging to 2 is a (complex) multiple of v.

The characteristic polynomial of B is Δ_B(t) = |tI − B| = t² + 1. Hence i and −i are the eigen-
values of B.

We find the eigenvectors associated with t = i. Substitute t = i into tI − B to obtain the
homogeneous system

    ( i−1 1; −2 i+1 )( x; y ) = ( 0; 0 )   or   { (i−1)x + y = 0, −2x + (i+1)y = 0 }   or   (i−1)x + y = 0

The system has only one independent solution, e.g. x = 1, y = 1 − i. Thus w = (1, 1 − i) is
an eigenvector which generates the eigenspace of i.

Now substitute t = −i into tI − B to obtain the homogeneous system

    ( −i−1 1; −2 −i+1 )( x; y ) = ( 0; 0 )   or   { (−i−1)x + y = 0, −2x + (−i+1)y = 0 }   or   (−i−1)x + y = 0

The system has only one independent solution, e.g. x = 1, y = 1 + i. Thus w′ = (1, 1 + i) is
an eigenvector which generates the eigenspace of −i.
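SymPy reproduces this contrast between the real and complex cases (assuming the matrices A and B as reconstructed above): it reports only the eigenvalue 2 for A, and the purely imaginary eigenvalues ±i for B, so B has no eigenvalue over R.

    from sympy import Matrix, Symbol

    t = Symbol('t')
    A = Matrix([[3, -1], [1, 1]])
    B = Matrix([[1, -1], [2, -1]])

    print(A.charpoly(t).as_expr(), B.charpoly(t).as_expr())   # t**2 - 4*t + 4  and  t**2 + 1
    print(A.eigenvects())    # only the eigenvalue 2, eigenspace spanned by (1, 1)
    print(B.eigenvects())    # eigenvalues -I and I: none of them real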



9.10. Find all eigenvalues and a basis of each eigenspace of the operator T : R³ → R³ defined
by T(x, y, z) = (2x + y, y − z, 2y + 4z).

First find a matrix representation of T, say relative to the usual basis of R³:

    A = [T] = ( 2 1 0; 0 1 −1; 0 2 4 )

The characteristic polynomial Δ(t) of T is then

    Δ(t) = |tI − A| = | t−2 −1 0; 0 t−1 1; 0 −2 t−4 | = (t − 2)²(t − 3)

Thus 2 and 3 are the eigenvalues of T.

We find a basis of the eigenspace of the eigenvalue 2. Substitute t = 2 into tI − A to obtain
the homogeneous system

    { −y = 0, y + z = 0, −2y − 2z = 0 }   or   { y = 0, y + z = 0 }

The system has only one independent solution, e.g. x = 1, y = 0, z = 0. Thus u = (1, 0, 0) forms
a basis of the eigenspace of 2.

We find a basis of the eigenspace of the eigenvalue 3. Substitute t = 3 into tI − A to obtain
the homogeneous system

    { x − y = 0, 2y + z = 0, −2y − z = 0 }   or   { x − y = 0, 2y + z = 0 }

The system has only one independent solution, e.g. x = 1, y = 1, z = −2. Thus v = (1, 1, −2)
forms a basis of the eigenspace of 3.

Observe that T is not diagonalizable, since T has only two linearly independent eigenvectors.

9.11. Show that 0 is an eigenvalue of T if and only if T is singular.

We have that 0 is an eigenvalue of T if and only if there exists a nonzero vector v such that
T(v) = 0v = 0, i.e. if and only if T is singular.

9.12. Let A and B be n-square matrices. Show that AB and BA have the same eigenvalues.

By Problem 9.11 and the fact that the product of nonsingular matrices is nonsingular, the fol-
lowing statements are equivalent: (i) 0 is an eigenvalue of AB, (ii) AB is singular, (iii) A or B is
singular, (iv) BA is singular, (v) 0 is an eigenvalue of BA.

Now suppose λ is a nonzero eigenvalue of AB. Then there exists a nonzero vector v such that
ABv = λv. Set w = Bv. Since λ ≠ 0 and v ≠ 0,

    Aw = ABv = λv ≠ 0   and so   w ≠ 0

But w is an eigenvector of BA belonging to the eigenvalue λ since

    BAw = BABv = Bλv = λBv = λw

Hence λ is an eigenvalue of BA. Similarly, any nonzero eigenvalue of BA is also an eigenvalue
of AB.

Thus AB and BA have the same eigenvalues.
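A numerical spot-check of this result (a sketch; the random 4 × 4 matrices are arbitrary illustrative choices):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))

    eig_AB = np.sort_complex(np.linalg.eigvals(A @ B))
    eig_BA = np.sort_complex(np.linalg.eigvals(B @ A))
    print(np.allclose(eig_AB, eig_BA))   # True: AB and BA have the same eigenvalues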



9.13. Suppose λ is an eigenvalue of an invertible operator T. Show that λ⁻¹ is an eigenvalue
of T⁻¹.

Since T is invertible, it is also nonsingular; hence by Problem 9.11, λ ≠ 0.

By definition of an eigenvalue, there exists a nonzero vector v for which T(v) = λv. Apply-
ing T⁻¹ to both sides, we obtain v = T⁻¹(λv) = λT⁻¹(v). Hence T⁻¹(v) = λ⁻¹v; that is, λ⁻¹
is an eigenvalue of T⁻¹.

9.14. Prove Theorem 9.3: Let v₁, ..., vₙ be nonzero eigenvectors of an operator T : V → V
belonging to distinct eigenvalues λ₁, ..., λₙ. Then v₁, ..., vₙ are linearly independent.

The proof is by induction on n. If n = 1, then v₁ is linearly independent since v₁ ≠ 0.
Assume n > 1. Suppose

    a₁v₁ + a₂v₂ + ··· + aₙvₙ = 0        (1)

where the aᵢ are scalars. Applying T to the above relation, we obtain by linearity

    a₁T(v₁) + a₂T(v₂) + ··· + aₙT(vₙ) = T(0) = 0

But by hypothesis T(vᵢ) = λᵢvᵢ; hence

    a₁λ₁v₁ + a₂λ₂v₂ + ··· + aₙλₙvₙ = 0        (2)




On the other hand, multiplying (1) by λₙ,

    a₁λₙv₁ + a₂λₙv₂ + ··· + aₙλₙvₙ = 0        (3)

Now subtracting (3) from (2),

    a₁(λ₁ − λₙ)v₁ + a₂(λ₂ − λₙ)v₂ + ··· + aₙ₋₁(λₙ₋₁ − λₙ)vₙ₋₁ = 0

By induction, each of the above coefficients is 0. Since the λᵢ are distinct, λᵢ − λₙ ≠ 0 for i ≠ n.
Hence a₁ = ··· = aₙ₋₁ = 0. Substituting this into (1) we get aₙvₙ = 0, and hence aₙ = 0. Thus
the vᵢ are linearly independent.



CHARACTERISTIC POLYNOMIAL, CAYLEY-HAMILTON THEOREM 

9.15. Consider a triangular matrix

    A = ( a₁₁ a₁₂ ... a₁ₙ; 0 a₂₂ ... a₂ₙ; ...; 0 0 ... aₙₙ )

Find its characteristic polynomial Δ(t) and its eigenvalues.

Since A is triangular and tI is diagonal, tI − A is also triangular with diagonal elements t − aᵢᵢ:

    tI − A = ( t−a₁₁ −a₁₂ ... −a₁ₙ; 0 t−a₂₂ ... −a₂ₙ; ...; 0 0 ... t−aₙₙ )

Then Δ(t) = |tI − A| is the product of the diagonal elements t − aᵢᵢ:

    Δ(t) = (t − a₁₁)(t − a₂₂)···(t − aₙₙ)

Hence the eigenvalues of A are a₁₁, a₂₂, ..., aₙₙ, i.e. its diagonal elements.



9.16. Let A = ( 1 2 3; 0 2 3; 0 0 3 ). Is A similar to a diagonal matrix? If so, find one such matrix.

Since A is triangular, the eigenvalues of A are the diagonal elements 1, 2 and 3. Since they
are distinct, A is similar to a diagonal matrix whose diagonal elements are 1, 2 and 3; for example,

    ( 1 0 0; 0 2 0; 0 0 3 )


9.17. For each matrix, find a polynomial having the matrix as a root:

    (i) A = ( 2 5; 1 −3 ),   (ii) B = ( 2 −3; 7 −4 ),   (iii) C = ( 1 4 −3; 0 3 1; 0 2 −1 )

By the Cayley-Hamilton theorem every matrix is a root of its characteristic polynomial.
Therefore we find the characteristic polynomial Δ(t) in each case.

(i) Δ(t) = |tI − A| = | t−2 −5; −1 t+3 | = t² + t − 11

(ii) Δ(t) = |tI − B| = | t−2 3; −7 t+4 | = t² + 2t + 13

(iii) Δ(t) = |tI − C| = | t−1 −4 3; 0 t−3 −1; 0 −2 t+1 | = (t − 1)(t² − 2t − 5)



9.18. Prove the Cayley-Hamilton Theorem 9.5: Every matrix is a zero of its characteristic
polynomial.

Let A be an arbitrary n-square matrix and let Δ(t) be its characteristic polynomial; say,

    Δ(t) = |tI − A| = tⁿ + aₙ₋₁tⁿ⁻¹ + ··· + a₁t + a₀

Now let B(t) denote the classical adjoint of the matrix tI − A. The elements of B(t) are cofactors
of the matrix tI − A and hence are polynomials in t of degree not exceeding n − 1. Thus

    B(t) = Bₙ₋₁tⁿ⁻¹ + ··· + B₁t + B₀

where the Bᵢ are n-square matrices over K which are independent of t. By the fundamental property
of the classical adjoint (Theorem 8.8),

    (tI − A) B(t) = |tI − A| I

or

    (tI − A)(Bₙ₋₁tⁿ⁻¹ + ··· + B₁t + B₀) = (tⁿ + aₙ₋₁tⁿ⁻¹ + ··· + a₁t + a₀)I

Removing parentheses and equating the coefficients of corresponding powers of t,

    Bₙ₋₁ = I
    Bₙ₋₂ − ABₙ₋₁ = aₙ₋₁I
    Bₙ₋₃ − ABₙ₋₂ = aₙ₋₂I
    ......................
    B₀ − AB₁ = a₁I
    −AB₀ = a₀I

Multiplying the above matrix equations by Aⁿ, Aⁿ⁻¹, ..., A, I respectively,

    AⁿBₙ₋₁ = Aⁿ
    Aⁿ⁻¹Bₙ₋₂ − AⁿBₙ₋₁ = aₙ₋₁Aⁿ⁻¹
    Aⁿ⁻²Bₙ₋₃ − Aⁿ⁻¹Bₙ₋₂ = aₙ₋₂Aⁿ⁻²
    ......................
    AB₀ − A²B₁ = a₁A
    −AB₀ = a₀I

Adding the above matrix equations,

    0 = Aⁿ + aₙ₋₁Aⁿ⁻¹ + ··· + a₁A + a₀I

In other words, Δ(A) = 0. That is, A is a zero of its characteristic polynomial.
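The theorem is easy to test on a concrete matrix; the following SymPy sketch substitutes the matrix of Problem 9.7 into its own characteristic polynomial and obtains the zero matrix.

    from sympy import Matrix, eye, zeros

    A = Matrix([[1, 4], [2, 3]])
    coeffs = A.charpoly().all_coeffs()     # [1, -4, -5], i.e. t**2 - 4*t - 5

    # evaluate the characteristic polynomial at A by Horner's rule
    result = zeros(2, 2)
    for c in coeffs:
        result = result * A + c * eye(2)
    print(result)                          # Matrix([[0, 0], [0, 0]]), as Cayley-Hamilton asserts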

9.19. Show that a matrix A and its transpose Aᵗ have the same characteristic polynomial.

By the transpose operation, (tI − A)ᵗ = tIᵗ − Aᵗ = tI − Aᵗ. Since a matrix and its transpose
have the same determinant, |tI − A| = |(tI − A)ᵗ| = |tI − Aᵗ|. Hence A and Aᵗ have the same char-
acteristic polynomial.



9.20. Suppose M = ( A₁ B; 0 A₂ ) where A₁ and A₂ are square matrices. Show that the char-
acteristic polynomial of M is the product of the characteristic polynomials of A₁ and
A₂. Generalize.

    tI − M = ( tI − A₁  −B;  0  tI − A₂ )

Hence by Problem 8.70, |tI − M| = |tI − A₁| |tI − A₂|, as required.

By induction, the characteristic polynomial of the triangular block matrix

    M = ( A₁ B ... C; 0 A₂ ... D; ...; 0 0 ... Aₙ )

where the Aᵢ are square matrices, is the product of the characteristic polynomials of the Aᵢ.



MINIMUM POLYNOMIAL 

9.21. Find the minimum polynomial m(t) of

    A = ( 2 1 0 0; 0 2 0 0; 0 0 1 1; 0 0 −2 4 )

The characteristic polynomial of A is

    Δ(t) = |tI − A| = | t−2 −1; 0 t−2 | · | t−1 −1; 2 t−4 | = (t − 2)²(t − 2)(t − 3) = (t − 3)(t − 2)³

The minimum polynomial m(t) must divide Δ(t). Also, each irreducible factor of Δ(t), i.e. t − 2
and t − 3, must be a factor of m(t). Thus m(t) is exactly one of the following:

    f(t) = (t − 3)(t − 2),   g(t) = (t − 3)(t − 2)²,   h(t) = (t − 3)(t − 2)³

We have

    f(A) = (A − 3I)(A − 2I) = ( 0 −1 0 0; 0 0 0 0; 0 0 0 0; 0 0 0 0 ) ≠ 0

    g(A) = (A − 3I)(A − 2I)² = 0

Thus g(t) = (t − 3)(t − 2)² is the minimum polynomial of A.

Remark. We know that h(A) = Δ(A) = 0 by the Cayley-Hamilton theorem. However, the degree
of g(t) is less than the degree of h(t); hence g(t), and not h(t), is the minimum polynomial of A.
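A SymPy sketch of this computation (using the matrix A as reconstructed above): the characteristic polynomial factors as (t − 3)(t − 2)³, while already (A − 3I)(A − 2I)² = 0, so the minimum polynomial is (t − 3)(t − 2)².

    from sympy import Matrix, Symbol, factor, eye, zeros

    t = Symbol('t')
    A = Matrix([[2, 1, 0, 0],
                [0, 2, 0, 0],
                [0, 0, 1, 1],
                [0, 0, -2, 4]])

    print(factor(A.charpoly(t).as_expr()))            # (t - 3)*(t - 2)**3

    f_A = (A - 3*eye(4)) * (A - 2*eye(4))             # f(t) = (t - 3)(t - 2)
    g_A = (A - 3*eye(4)) * (A - 2*eye(4))**2          # g(t) = (t - 3)(t - 2)**2
    print(f_A == zeros(4, 4), g_A == zeros(4, 4))     # False True, so m(t) = (t - 3)(t - 2)**2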



9.22. Find the minimum polynomial m(t) of each matrix (where λ ≠ 0):

    (i) A = ( λ 1; 0 λ ),   (ii) B = ( λ 1 0; 0 λ 1; 0 0 λ ),   (iii) C = ( λ 1 0 0; 0 λ 1 0; 0 0 λ 1; 0 0 0 λ )

(i) The characteristic polynomial of A is Δ(t) = (t − λ)². We find A − λI ≠ 0; hence m(t) =
Δ(t) = (t − λ)².

(ii) The characteristic polynomial of B is Δ(t) = (t − λ)³. (Note m(t) is exactly one of t − λ, (t − λ)²
or (t − λ)³.) We find (B − λI)² ≠ 0; thus m(t) = Δ(t) = (t − λ)³.

(iii) The characteristic polynomial of C is Δ(t) = (t − λ)⁴. We find (C − λI)³ ≠ 0; hence m(t) =
Δ(t) = (t − λ)⁴.




9.23. Let M = ( A 0; 0 B ) where A and B are square matrices. Show that the minimum
polynomial m(t) of M is the least common multiple of the minimum polynomials g(t)
and h(t) of A and B respectively. Generalize.

Since m(t) is the minimum polynomial of M,

    m(M) = ( m(A) 0; 0 m(B) ) = 0

and hence m(A) = 0 and m(B) = 0. Since g(t) is the minimum polynomial of A, g(t) divides m(t). Similarly, h(t)
divides m(t). Thus m(t) is a multiple of g(t) and h(t).

Now let f(t) be another multiple of g(t) and h(t); then

    f(M) = ( f(A) 0; 0 f(B) ) = ( 0 0; 0 0 ) = 0

But m(t) is the minimum polynomial of M; hence m(t) divides f(t). Thus m(t) is the least common
multiple of g(t) and h(t).

We then have, by induction, that the minimum polynomial of

    M = ( A₁ 0 ... 0; 0 A₂ ... 0; ...; 0 0 ... Aᵣ )

where the Aᵢ are square matrices, is the least common multiple of the minimum polynomials of
the Aᵢ.



9.24. Find the minimum polynomial m(t) of the block diagonal matrix

    M = ( A 0 0 0; 0 B 0 0; 0 0 C 0; 0 0 0 D )

where A = ( 2 8; 0 2 ), B = ( 4 2; 1 3 ), C = ( 0 1; 0 0 ) and D = ( 5 ).

The minimum polynomials of A, C and D are (t − 2)², t² and t − 5 respectively. The characteristic
polynomial of B is

    |tI − B| = | t−4 −2; −1 t−3 | = t² − 7t + 10 = (t − 2)(t − 5)

and so it is also the minimum polynomial of B.

Observe that M is the block diagonal matrix with diagonal blocks A, B, C and D. Thus m(t) is
the least common multiple of the minimum polynomials of A, B, C and D. Accordingly,
m(t) = t²(t − 2)²(t − 5).



9.25. Show that the minimum polynomial of a matrix (operator) A exists and is unique. 

By the Cayley-Hamilton theorem, A is a zero of some nonzero polynomial (see also Problem 9.31). 
Let n be the lowest degree for which a polynomial f(t) exists such that /(A) = 0. Dividing f(t) by 
its leading coefficient, we obtain a monic polynomial m(t) of degree n which has A as a zero. Sup- 
pose m'(t) is another monic polynomial of degree n for which m'{A) = 0. Then the difference 
m{t) — m'(t) is a nonzero polynomial of degree less than n which has A as a zero. This contradicts 
the original assumption on n; hence m(t) is a unique minimum polynomial. 




9.26. Prove Theorem 9.10: The minimum polynomial m(t) of a matrix (operator) A 
divides every polynomial which has A as a zero. In particular, m{t) divides the char- 
acteristic polynomial of A. 

Suppose f(t) is a polynomial for which f(A) = 0. By the division algorithm there exist poly-
nomials q(t) and r(t) for which f(t) = m(t) q(t) + r(t) and r(t) = 0 or deg r(t) < deg m(t). Sub-
stituting t = A in this equation, and using that f(A) = 0 and m(A) = 0, we obtain r(A) = 0. If
r(t) ≠ 0, then r(t) is a polynomial of degree less than that of m(t) which has A as a zero; this contradicts
the definition of the minimum polynomial. Thus r(t) = 0 and so f(t) = m(t) q(t), i.e. m(t) divides f(t).

9.27. Let m(t) be the minimum polynomial of an n-square matrix A. Show that the char-
acteristic polynomial of A divides (m(t))ⁿ.

Suppose m(t) = tʳ + c₁tʳ⁻¹ + ··· + cᵣ₋₁t + cᵣ. Consider the following matrices:

    B₀ = I
    B₁ = A + c₁I
    B₂ = A² + c₁A + c₂I
    ......................
    Bᵣ₋₁ = Aʳ⁻¹ + c₁Aʳ⁻² + ··· + cᵣ₋₁I

Then

    B₀ = I
    B₁ − AB₀ = c₁I
    B₂ − AB₁ = c₂I
    ......................
    Bᵣ₋₁ − ABᵣ₋₂ = cᵣ₋₁I

Also,

    −ABᵣ₋₁ = cᵣI − (Aʳ + c₁Aʳ⁻¹ + ··· + cᵣ₋₁A + cᵣI) = cᵣI − m(A) = cᵣI

Set

    B(t) = tʳ⁻¹B₀ + tʳ⁻²B₁ + ··· + tBᵣ₋₂ + Bᵣ₋₁

Then

    (tI − A)·B(t) = (tʳB₀ + tʳ⁻¹B₁ + ··· + tBᵣ₋₁) − (tʳ⁻¹AB₀ + tʳ⁻²AB₁ + ··· + ABᵣ₋₁)
                  = tʳB₀ + tʳ⁻¹(B₁ − AB₀) + tʳ⁻²(B₂ − AB₁) + ··· + t(Bᵣ₋₁ − ABᵣ₋₂) − ABᵣ₋₁
                  = tʳI + c₁tʳ⁻¹I + c₂tʳ⁻²I + ··· + cᵣ₋₁tI + cᵣI
                  = m(t) I

The determinant of both sides gives |tI − A| |B(t)| = |m(t) I| = (m(t))ⁿ. Since |B(t)| is a polynomial,
|tI − A| divides (m(t))ⁿ; that is, the characteristic polynomial of A divides (m(t))ⁿ.

9.28. Prove Theorem 9.11: The characteristic polynomial Δ(t) and the minimum poly-
nomial m(t) of a matrix A have the same irreducible factors.

Suppose f(t) is an irreducible polynomial. If f(t) divides m(t) then, since m(t) divides Δ(t), f(t)
divides Δ(t). On the other hand, if f(t) divides Δ(t) then, by the preceding problem, f(t) divides
(m(t))ⁿ. But f(t) is irreducible; hence f(t) also divides m(t). Thus m(t) and Δ(t) have the same
irreducible factors.

9.29. Let T be a linear operator on a vector space V of finite dimension. Show that T is
invertible if and only if the constant term of the minimal (characteristic) polynomial
of T is not zero.

Suppose the minimal (characteristic) polynomial of T is f(t) = tⁿ + aₙ₋₁tⁿ⁻¹ + ··· + a₁t + a₀.
Each of the following statements is equivalent to the succeeding one by preceding results: (i) T is
invertible; (ii) T is nonsingular; (iii) 0 is not an eigenvalue of T; (iv) 0 is not a root of f(t); (v) the
constant term a₀ is not zero. Thus the theorem is proved.





9.30. Suppose dim V = n. Let T : V → V be an invertible operator. Show that T⁻¹ is
equal to a polynomial in T of degree not exceeding n.

Let m(t) be the minimal polynomial of T. Then m(t) = tʳ + aᵣ₋₁tʳ⁻¹ + ··· + a₁t + a₀, where
r ≤ n. Since T is invertible, a₀ ≠ 0. We have

    m(T) = Tʳ + aᵣ₋₁Tʳ⁻¹ + ··· + a₁T + a₀I = 0

Hence

    −(1/a₀)(Tʳ⁻¹ + aᵣ₋₁Tʳ⁻² + ··· + a₁I) T = I   and   T⁻¹ = −(1/a₀)(Tʳ⁻¹ + aᵣ₋₁Tʳ⁻² + ··· + a₁I)
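The same trick works with the characteristic polynomial, whose constant term is also nonzero when T is invertible (Problem 9.29). A small SymPy sketch, using the 2 × 2 matrix of Problem 9.7 as an illustration:

    from sympy import Matrix, eye

    A = Matrix([[1, 4], [2, 3]])
    coeffs = A.charpoly().all_coeffs()      # [1, -4, -5]: t**2 - 4*t - 5, constant term a0 = -5
    a1, a0 = coeffs[1], coeffs[2]

    # From A**2 + a1*A + a0*I = 0:  A**(-1) = -(1/a0)*(A + a1*I)
    A_inv = -(A + a1 * eye(2)) / a0
    print(A_inv == A.inv())                 # True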



MISCELLANEOUS PROBLEMS 

9.31. Let T be a linear operator on a vector space V of dimension n. Without using the 
Cayley-Hamilton theorem, show that T is a zero of a nonzero polynomial. 

Let N = n². Consider the following N + 1 operators on V: I, T, T², ..., Tᴺ. Recall that the
vector space A(V) of operators on V has dimension N = n². Thus the above N + 1 operators are
linearly dependent. Hence there exist scalars a₀, a₁, ..., a_N, not all zero, for which
a_N Tᴺ + ··· + a₁T + a₀I = 0. Accordingly, T is a zero of the nonzero polynomial f(t) = a_N tᴺ + ··· + a₁t + a₀.



9.32. Prove Theorem 9.13: Let λ be an eigenvalue of an operator T : V → V. The geometric
multiplicity of λ does not exceed its algebraic multiplicity.

Suppose the geometric multiplicity of λ is r. Then the eigenspace of λ contains r linearly independent
eigenvectors v₁, ..., vᵣ. Extend the set {vᵢ} to a basis of V: {v₁, ..., vᵣ, w₁, ..., wₛ}. We have

    T(v₁) = λv₁
    T(v₂) = λv₂
    ......................
    T(vᵣ) = λvᵣ
    T(w₁) = a₁₁v₁ + ··· + a₁ᵣvᵣ + b₁₁w₁ + ··· + b₁ₛwₛ
    T(w₂) = a₂₁v₁ + ··· + a₂ᵣvᵣ + b₂₁w₁ + ··· + b₂ₛwₛ
    ......................
    T(wₛ) = aₛ₁v₁ + ··· + aₛᵣvᵣ + bₛ₁w₁ + ··· + bₛₛwₛ

The matrix of T in the above basis is the block matrix

    M = ( λI_r  A;  0  B )

where A = (aⱼᵢ)ᵗ and B = (bⱼᵢ)ᵗ.

By Problem 9.20 the characteristic polynomial of λI_r, which is (t − λ)ʳ, must divide the char-
acteristic polynomial of M and hence of T. Thus the algebraic multiplicity of λ for the operator T is
at least r, as required.



9.33. Show that A = ( 1 1; 0 1 ) is not diagonalizable.

The characteristic polynomial of A is Δ(t) = (t − 1)²; hence 1 is the only eigenvalue of A. We
find a basis of the eigenspace of the eigenvalue 1. Substitute t = 1 into the matrix tI − A to obtain
the homogeneous system

    ( 0 −1; 0 0 )( x; y ) = ( 0; 0 )   or   −y = 0

The system has only one independent solution, e.g. x = 1, y = 0. Hence u = (1, 0) forms a basis
of the eigenspace of 1.

Since A has at most one independent eigenvector, A cannot be diagonalized.

9.34. Let F be an extension of a field K. Let A be an n-square matrix over K. Note that
A may also be viewed as a matrix Ā over F. Clearly |tI − A| = |tI − Ā|, that is, A
and Ā have the same characteristic polynomial. Show that A and Ā also have the
same minimum polynomial.

Let m(t) and m′(t) be the minimum polynomials of A and Ā respectively. Now m′(t) divides
every polynomial over F which has A as a zero. Since m(t) has A as a zero and since m(t) may be
viewed as a polynomial over F, m′(t) divides m(t). We show now that m(t) divides m′(t).

Since m′(t) is a polynomial over F which is an extension of K, we may write

    m′(t) = f₁(t)b₁ + f₂(t)b₂ + ··· + fₙ(t)bₙ

where the fᵢ(t) are polynomials over K, and b₁, ..., bₙ belong to F and are linearly independent over K.

We have

    m′(A) = f₁(A)b₁ + f₂(A)b₂ + ··· + fₙ(A)bₙ = 0        (1)

Let aᵢⱼ⁽ᵏ⁾ denote the ij-entry of fₖ(A). The above matrix equation implies that, for each pair (i, j),

    aᵢⱼ⁽¹⁾b₁ + aᵢⱼ⁽²⁾b₂ + ··· + aᵢⱼ⁽ⁿ⁾bₙ = 0

Since the bᵢ are linearly independent over K and since the aᵢⱼ⁽ᵏ⁾ ∈ K, every aᵢⱼ⁽ᵏ⁾ = 0. Then

    f₁(A) = 0,  f₂(A) = 0,  ...,  fₙ(A) = 0

Since the fᵢ(t) are polynomials over K which have A as a zero and since m(t) is the minimum poly-
nomial of A as a matrix over K, m(t) divides each of the fᵢ(t). Accordingly, by (1), m(t) must also
divide m′(t). But monic polynomials which divide each other are necessarily equal. That is,
m(t) = m′(t), as required.

9.35. Let {v₁, ..., vₙ} be a basis of V. Let T : V → V be an operator for which T(v₁) = 0,
T(v₂) = a₂₁v₁, T(v₃) = a₃₁v₁ + a₃₂v₂, ..., T(vₙ) = aₙ₁v₁ + ··· + a_{n,n−1}vₙ₋₁. Show that
Tⁿ = 0.

It suffices to show that

    Tʲ(vⱼ) = 0        (*)

for j = 1, ..., n. For then it follows that

    Tⁿ(vⱼ) = Tⁿ⁻ʲ(Tʲ(vⱼ)) = Tⁿ⁻ʲ(0) = 0,   for j = 1, ..., n

and, since {v₁, ..., vₙ} is a basis, Tⁿ = 0.

We prove (*) by induction on j. The case j = 1 is true by hypothesis. The inductive step
follows (for j = 2, ..., n) from

    Tʲ(vⱼ) = Tʲ⁻¹(T(vⱼ)) = Tʲ⁻¹(aⱼ₁v₁ + ··· + a_{j,j−1}vⱼ₋₁)
           = aⱼ₁Tʲ⁻¹(v₁) + ··· + a_{j,j−1}Tʲ⁻¹(vⱼ₋₁)
           = aⱼ₁·0 + ··· + a_{j,j−1}·0 = 0

Remark. Observe that the matrix representation of T in the above basis is triangular with
diagonal elements 0:

    ( 0 a₂₁ a₃₁ ... aₙ₁; 0 0 a₃₂ ... aₙ₂; ...; 0 0 0 ... a_{n,n−1}; 0 0 0 ... 0 )



Supplementary Problems 

POLYNOMIALS OF MATRICES AND LINEAR OPERATORS 

9.36. Let f(t) = 2t² − 5t + 6 and g(t) = t³ − 2t² + t + 3. Find f(A), g(A), f(B) and g(B) where

9.37. Let T : R² → R² be defined by T(x, y) = (x + y, 2x). Let f(t) = t² − 2t + 3. Find f(T)(x, y).

9.38. Let V be the vector space of polynomials v(x) = ax² + bx + c. Let D : V → V be the differential
operator. Let f(t) = t² + 2t − 5. Find f(D)(v(x)).



9.39. Let A 



.0 



Find A2, A3, A". 



9.40. Let B = ( 8 12 0; 0 8 12; 0 0 8 ). Find a real matrix A such that B = A³.



9.41. Consider a diagonal matrix M and a triangular matrix N:

    M = ( a₁ 0 ... 0; 0 a₂ ... 0; ...; 0 0 ... aₙ )   and   N = ( a₁ b ... c; 0 a₂ ... d; ...; 0 0 ... aₙ )

Show that, for any polynomial f(t), f(M) and f(N) are of the form

    f(M) = ( f(a₁) 0 ... 0; 0 f(a₂) ... 0; ...; 0 0 ... f(aₙ) )   and   f(N) = ( f(a₁) x ... y; 0 f(a₂) ... z; ...; 0 0 ... f(aₙ) )

9.42. Consider a block diagonal matrix M and a block triangular matrix N:

    M = ( A₁ 0 ... 0; 0 A₂ ... 0; ...; 0 0 ... Aₙ )   and   N = ( A₁ B ... C; 0 A₂ ... D; ...; 0 0 ... Aₙ )

where the Aᵢ are square matrices. Show that, for any polynomial f(t), f(M) and f(N) are of the form

    f(M) = ( f(A₁) 0 ... 0; 0 f(A₂) ... 0; ...; 0 0 ... f(Aₙ) )   and   f(N) = ( f(A₁) X ... Y; 0 f(A₂) ... Z; ...; 0 0 ... f(Aₙ) )

9.43. Show that for any square matrix (or operator) A, (P⁻¹AP)ⁿ = P⁻¹AⁿP where P is invertible. More
generally, show that f(P⁻¹AP) = P⁻¹ f(A) P for any polynomial f(t).

9.44. Let f(t) be any polynomial. Show that: (i) f(Aᵗ) = (f(A))ᵗ; (ii) if A is symmetric, i.e. Aᵗ = A,
then f(A) is symmetric.

EIGENVALUES AND EIGENVECTORS 

9.45. For each matrix, find all eigenvalues and linearly independent eigenvectors:

    (i) A = ( 2 2; 1 3 ),   (ii) B = ( 4 2; 3 3 ),   (iii) C = ( 5 −1; 1 3 )

Find invertible matrices P₁, P₂ and P₃ such that P₁⁻¹AP₁, P₂⁻¹BP₂ and P₃⁻¹CP₃ are diagonal.




9.46. For each matrix, find all eigenvalues and a basis for each eigenspace:

    (i) A = ( 3 1 1; 2 4 2; 1 1 3 ),   (ii) B = ( 1 2 2; 1 2 −1; −1 1 4 ),   (iii) C = ( 1 1 0; 0 1 0; 0 0 1 )

When possible, find invertible matrices P₁, P₂ and P₃ such that P₁⁻¹AP₁, P₂⁻¹BP₂ and P₃⁻¹CP₃ are
diagonal.

9.47. Consider A = ( 2 −1; 1 4 ) and B = ( 3 −1; 13 −3 ) as matrices over the real field R. Find all eigen-
values and linearly independent eigenvectors.

9.48. Consider A and B in the preceding problem as matrices over the complex field C. Find all eigen-
values and linearly independent eigenvectors.

9.49. For each of the following operators T : R² → R², find all eigenvalues and a basis for each eigen-
space: (i) T(x, y) = (3x + 3y, x + 5y); (ii) T(x, y) = (y, x); (iii) T(x, y) = (y, −x).

9.50. For each of the following operators T : R³ → R³, find all eigenvalues and a basis for each
eigenspace: (i) T(x, y, z) = (x + y + z, 2y + z, 2y + 3z); (ii) T(x, y, z) = (x + y, y + z, −2y − z);
(iii) T(x, y, z) = (x − y, 2x + 3y + 2z, x + y + 2z).

9.51. For each of the following matrices over the complex field C, find all eigenvalues and linearly 
independent eigenvectors: 

<"(: ;)• ""(J D- «G:r). «(;:?; 

9.52. Suppose v is an eigenvector of operators S and T. Show that v is also an eigenvector of the operator
aS + bT where a and b are any scalars.

9.53. Suppose v is an eigenvector of an operator T belonging to the eigenvalue λ. Show that for n > 0,
v is also an eigenvector of Tⁿ belonging to λⁿ.

9.54. Suppose λ is an eigenvalue of an operator T. Show that f(λ) is an eigenvalue of f(T).

9.55. Show that similar matrices have the same eigenvalues.

9.56. Show that matrices A and Aᵗ have the same eigenvalues. Give an example where A and Aᵗ have
different eigenvectors.

9.57. Let S and T be linear operators such that ST = TS. Let λ be an eigenvalue of T and let W be
its eigenspace. Show that W is invariant under S, i.e. S(W) ⊆ W.

9.58. Let V be a vector space of finite dimension over the complex field C. Let W ≠ {0} be a subspace
of V invariant under a linear operator T : V → V. Show that W contains a nonzero eigenvector of T.

9.59. Let A be an n-square matrix over K. Let v₁, ..., vₙ ∈ Kⁿ be linearly independent eigenvectors of
A belonging to the eigenvalues λ₁, ..., λₙ respectively. Let P be the matrix whose columns are the
vectors v₁, ..., vₙ. Show that P⁻¹AP is the diagonal matrix whose diagonal elements are the
eigenvalues λ₁, ..., λₙ.

CHARACTERISTIC AND MINIMUM POLYNOMIALS 

9.60. For each matrix, find a polynomial for which the matrix is a root: 

'2 3 -2^ 

4 1. 



^^ ^ = (4 D' (") ^ = (3 3)' ^"^ '^='[1 i_i 




9.61. Consider the n-square matrix

    A = ( λ 1 0 ... 0 0; 0 λ 1 ... 0 0; ...; 0 0 0 ... λ 1; 0 0 0 ... 0 λ )

Show that f(t) = (t − λ)ⁿ is both the characteristic and minimum polynomial of A.



9.62. Find the characteristic and minimum polynomials of each matrix:

    A = ( 2 5 0 0 0; 0 2 0 0 0; 0 0 4 2 0; 0 0 3 5 0; 0 0 0 0 7 ),
    B = ( 3 1 0 0 0; 0 3 0 0 0; 0 0 3 1 0; 0 0 0 3 1; 0 0 0 0 3 ),
    C = ( λ 0 0 0 0; 0 λ 0 0 0; 0 0 λ 0 0; 0 0 0 λ 0; 0 0 0 0 λ )



9.63. Let A = ( 1 1 0; 0 2 0; 0 0 1 ) and B = ( 2 0 0; 0 2 2; 0 0 1 ). Show that A and B have different characteristic
polynomials (and so are not similar), but have the same minimum polynomial. Thus nonsimilar
matrices may have the same minimum polynomial.

9.64. The mapping T : V → V defined by T(v) = kv is called the scalar mapping belonging to k ∈ K.
Show that T is the scalar mapping belonging to k ∈ K if and only if the minimal polynomial of
T is m(t) = t − k.

9.65. Let A be an n-square matrix for which Aᵏ = 0 for some k > n. Show that Aⁿ = 0.

9.66. Show that a matrix A and its transpose Aᵗ have the same minimum polynomial.

9.67. Suppose f(t) is an irreducible monic polynomial for which f(T) = 0 where T is a linear operator
T : V → V. Show that f(t) is the minimal polynomial of T.



9.68. Consider a block matrix M = ( A B; C D ). Show that tI − M = ( tI − A  −B;  −C  tI − D ) is the char-
acteristic matrix of M.



9.69. Let T be a linear operator on a vector space V of finite dimension. Let W be a subspace of V
invariant under T, i.e. T(W) ⊆ W. Let T_W : W → W be the restriction of T to W. (i) Show that
the characteristic polynomial of T_W divides the characteristic polynomial of T. (ii) Show that the
minimum polynomial of T_W divides the minimum polynomial of T.



9.70. Let A = ( a₁₁ a₁₂ a₁₃; a₂₁ a₂₂ a₂₃; a₃₁ a₃₂ a₃₃ ). Show that the characteristic polynomial of A is

    Δ(t) = t³ − (a₁₁ + a₂₂ + a₃₃)t² + ( | a₁₁ a₁₂; a₂₁ a₂₂ | + | a₁₁ a₁₃; a₃₁ a₃₃ | + | a₂₂ a₂₃; a₃₂ a₃₃ | ) t − | a₁₁ a₁₂ a₁₃; a₂₁ a₂₂ a₂₃; a₃₁ a₃₂ a₃₃ |

9.71. Let A be an n-square matrix. The determinant of the matrix of order n − m obtained by deleting
the rows and columns passing through m diagonal elements of A is called a principal minor of degree
n − m. Show that the coefficient of tᵐ in the characteristic polynomial Δ(t) = |tI − A| is the sum
of all principal minors of A of degree n − m multiplied by (−1)ⁿ⁻ᵐ. (Observe that the preceding
problem is a special case of this result.)




9.72. Consider an arbitrary monic polynomial f(t) = tⁿ + aₙ₋₁tⁿ⁻¹ + ··· + a₁t + a₀. The following
n-square matrix A is called the companion matrix of f(t):

    A = ( 0 0 ... 0 −a₀; 1 0 ... 0 −a₁; 0 1 ... 0 −a₂; ...; 0 0 ... 1 −aₙ₋₁ )

Show that f(t) is the minimum polynomial of A.
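A sketch of the construction in SymPy (the helper function and the sample polynomial t³ − 5t² + 6t + 8 are my own illustrative choices); it builds the companion matrix and checks that its characteristic polynomial, which by this problem is also its minimum polynomial, recovers f(t).

    from sympy import Matrix, Symbol, expand

    def companion(coeffs):
        """Companion matrix of t**n + coeffs[0]*t**(n-1) + ... + coeffs[-1],
        where coeffs = [a_{n-1}, ..., a_1, a_0]."""
        n = len(coeffs)
        C = Matrix.zeros(n, n)
        for i in range(1, n):
            C[i, i - 1] = 1                    # 1's on the subdiagonal
        for i in range(n):
            C[i, n - 1] = -coeffs[n - 1 - i]   # last column holds -a_0, -a_1, ..., -a_{n-1}
        return C

    t = Symbol('t')
    A = companion([-5, 6, 8])                  # companion matrix of t**3 - 5*t**2 + 6*t + 8
    print(A)                                   # Matrix([[0, 0, -8], [1, 0, -6], [0, 1, 5]])
    print(expand(A.charpoly(t).as_expr()))     # t**3 - 5*t**2 + 6*t + 8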

9.73. Find a matrix A whose minimum polynomial is (i) t³ − 5t² + 6t + 8, (ii) t⁴ − 5t³ − 2t² + 7t + 4.

DIAGONALIZATION 

9.74. Let A = ( a b; c d ) be a matrix over the real field R. Find necessary and sufficient conditions on
a, b, c and d so that A is diagonalizable, i.e. has two linearly independent eigenvectors.

9.75. Repeat the preceding problem for the case that A is a matrix over the complex field C.

9.76. Show that a matrix (operator) is diagonalizable if and only if its minimal polynomial is a product
of distinct linear factors.

9.77. Let A and B be n-square matrices over K such that (i) AB = BA and (ii) A and B are both
diagonalizable. Show that A and B can be simultaneously diagonalized, i.e. there exists a basis of
Kⁿ in which both A and B are represented by diagonal matrices. (See Problem 9.57.)

9.78. Let E : V → V be a projection operator, i.e. E² = E. Show that E is diagonalizable and, in fact,
can be represented by the diagonal matrix A = ( I_r 0; 0 0 ) where r is the rank of E.



Answers to Supplementary Problems 



9.36. f(A) = ( −26 −3; 5 −27 ),  g(A) = ( −40 39; −65 −27 ),  f(B) = ( 3 6; 0 9 ),  g(B) = ( 3 12; 0 15 )



9.37. f(T)(x, y) = (4x − y, −2x + 5y).

9.38. f(D)(v(x)) = −5ax² + (4a − 5b)x + (2a + 2b − 5c).



"'• --': I)' ^'-(i t)' ^-ii: 



9.40. Hint. Let A = ( 2 a b; 0 2 c; 0 0 2 ). Set B = A³ and then obtain conditions on a, b and c.

9.44. (ii) Using (i), we have (f(A))ᵗ = f(Aᵗ) = f(A).

9.45. (i) λ₁ = 1, u = (2, −1); λ₂ = 4, v = (1, 1).
      (ii) λ₁ = 1, u = (2, −3); λ₂ = 6, v = (1, 1).
      (iii) λ = 4, u = (1, 1).

Let P₁ = ( 2 1; −1 1 ) and P₂ = ( 2 1; −3 1 ). P₃ does not exist since C has only one independent
eigenvector, and so cannot be diagonalized.


9.46. (i) λ₁ = 2, u = (1, −1, 0), v = (1, 0, −1); λ₂ = 6, w = (1, 2, 1).
      (ii) λ₁ = 3, u = (1, 1, 0), v = (1, 0, 1); λ₂ = 1, w = (2, −1, 1).
      (iii) λ = 1, u = (1, 0, 0), v = (0, 0, 1).

Let P₁ = ( 1 1 1; −1 0 2; 0 −1 1 ) and P₂ = ( 1 1 2; 1 0 −1; 0 1 1 ). P₃ does not exist since C has at most two
linearly independent eigenvectors, and so cannot be diagonalized.

9.47. (i) λ = 3, u = (1, −1); (ii) B has no eigenvalues (in R).

9.48. (i) λ = 3, u = (1, −1). (ii) λ₁ = 2i, u = (1, 3 − 2i); λ₂ = −2i, v = (1, 3 + 2i).

9.49. (i) λ₁ = 2, u = (3, −1); λ₂ = 6, v = (1, 1). (ii) λ₁ = 1, u = (1, 1); λ₂ = −1, v = (1, −1). (iii) There
are no eigenvalues (in R).

9.50. (i) λ₁ = 1, u = (1, 0, 0); λ₂ = 4, v = (1, 1, 2).
      (ii) λ = 1, u = (1, 0, 0). There are no other eigenvalues (in R).
      (iii) λ₁ = 1, u = (1, 0, −1); λ₂ = 2, v = (2, −2, −1); λ₃ = 3, w = (1, −2, −1).

9.51. (i) λ₁ = 1, u = (1, 0); λ₂ = i, v = (1, 1 + i). (ii) λ = 1, u = (1, 0). (iii) λ₁ = 2, u = (3, i); λ₂ = −2,
v = (1, −i). (iv) λ₁ = i, u = (2, 1 − i); λ₂ = −i, v = (2, 1 + i).

9.56. Let A = ( 1 1; 0 1 ). Then λ = 1 is the only eigenvalue and v = (1, 0) generates the eigenspace
of λ = 1. On the other hand, for Aᵗ = ( 1 0; 1 1 ), λ = 1 is still the only eigenvalue, but w = (0, 1)
generates the eigenspace of λ = 1.

9.57. Let v ∈ W, and so T(v) = λv. Then T(Sv) = S(Tv) = S(λv) = λ(Sv), that is, Sv is an eigenvector
of T belonging to the eigenvalue λ. In other words, Sv ∈ W and thus S(W) ⊆ W.

9.58. Let T̂ : W → W be the restriction of T to W. The characteristic polynomial of T̂ is a polynomial
over the complex field C which, by the fundamental theorem of algebra, has a root λ. Then λ is an
eigenvalue of T̂, and so T̂ has a nonzero eigenvector in W which is also an eigenvector of T.

9.59. Suppose T(v) = λv. Then (kT)(v) = kT(v) = k(λv) = (kλ)v.

9.60. (i) f(t) = t² − 8t + 43, (ii) g(t) = t² − 8t + 23, (iii) h(t) = t³ − 6t² + 5t − 12.

9.62. (i) Δ(t) = (t − 2)³(t − 7)²; m(t) = (t − 2)²(t − 7). (ii) Δ(t) = (t − 3)⁵; m(t) = (t − 3)³. (iii) Δ(t) =
(t − λ)⁵; m(t) = t − λ.

9.73. Use the result of Problem 9.72. (i) A = ( 0 0 −8; 1 0 −6; 0 1 5 ), (ii) A = ( 0 0 0 −4; 1 0 0 −7; 0 1 0 2; 0 0 1 5 ).

9.77. Hint. Use the result of Problem 9.57. 




chapter 10 



Canonical Forms 

INTRODUCTION 

Let T be a linear operator on a vector space of finite dimension. As seen in the preceding
chapter, T may not have a diagonal matrix representation. However, it is still possible 
to "simplify" the matrix representation of T in a number of ways. This is the main topic 
of this chapter. In particular, we obtain the primary decomposition theorem, and the 
triangular, Jordan and rational canonical forms. 

We comment that the triangular and Jordan canonical forms exist for T if and only if 
the characteristic polynomial A{t) of T has all its roots in the base field K. This is always 
true if K is the complex field C but may not be true if K is the real field R. 

We also introduce the idea of a quotient space. This is a very powerful tool and will be 
used in the proof of the existence of the triangular and rational canonical forms. 

TRIANGULAR FORM 

Let T be a linear operator on an n-dimensional vector space V. Suppose T can be rep-
resented by the triangular matrix

    A = ( a₁₁ a₁₂ ... a₁ₙ; 0 a₂₂ ... a₂ₙ; ...; 0 0 ... aₙₙ )

Then the characteristic polynomial of T,

    Δ(t) = |tI − A| = (t − a₁₁)(t − a₂₂)···(t − aₙₙ)

is a product of linear factors. The converse is also true and is an important theorem;
namely,

Theorem 10.1: Let T : V → V be a linear operator whose characteristic polynomial factors
into linear polynomials. Then there exists a basis of V in which T is
represented by a triangular matrix.

Alternate Form of Theorem 10.1: Let A be a square matrix whose characteristic poly- 
nomial factors into linear polynomials. Then A is 
similar to a triangular matrix, i.e. there exists an in- 
vertible matrix P such that P'^AP is triangular. 

We say that an operator T can be brought into triangular form if it can be represented 
by a triangular matrix. Note that in this case the eigenvalues of T are precisely those 
entries appearing on the main diagonal. We give an application of this remark. 


Example 10.1: Let A be a square matrix over the complex field C. Suppose λ is an eigenvalue of A².
Show that √λ or −√λ is an eigenvalue of A. We know by the above theorem that
A is similar to a triangular matrix

    B = ( μ₁ * ... *; 0 μ₂ ... *; ...; 0 0 ... μₙ )

Hence A² is similar to the matrix

    B² = ( μ₁² * ... *; 0 μ₂² ... *; ...; 0 0 ... μₙ² )

Since similar matrices have the same eigenvalues, λ = μᵢ² for some i. Hence
μᵢ = √λ or μᵢ = −√λ; that is, √λ or −√λ is an eigenvalue of A.

INVARIANCE 

Let T : V → V be linear. A subspace W of V is said to be invariant under T or
T-invariant if T maps W into itself, i.e. if v ∈ W implies T(v) ∈ W. In this case T
restricted to W defines a linear operator on W; that is, T induces a linear operator T̂ : W → W
defined by T̂(w) = T(w) for every w ∈ W.

Example 10.2: Let T : R³ → R³ be the linear operator which rotates each vector about the z axis
by an angle θ:

    T(x, y, z) = (x cos θ − y sin θ, x sin θ + y cos θ, z)

Observe that each vector w = (a, b, 0) in the xy plane W remains in W under the
mapping T, i.e. W is T-invariant. Observe also that the z axis U is invariant under T.
Furthermore, the restriction of T to W rotates each vector about the origin O, and
the restriction of T to U is the identity mapping on U.

Example 10.3: Nonzero eigenvectors of a linear operator T : V → V may be characterized as gen-
erators of T-invariant 1-dimensional subspaces. For suppose T(v) = λv, v ≠ 0.
Then W = {kv, k ∈ K}, the 1-dimensional subspace generated by v, is invariant
under T because

    T(kv) = k T(v) = k(λv) = kλv ∈ W

Conversely, suppose dim U = 1 and u ≠ 0 generates U, and U is invariant under
T. Then T(u) ∈ U and so T(u) is a multiple of u, i.e. T(u) = μu. Hence u is an
eigenvector of T.

The next theorem gives us an important class of invariant subspaces. 

Theorem 10.2: Let T:V^V be linear, and let f{t) be any polynomial. Then the kernel 
of f{T) is invariant under T. 




The notion of invariance is related to matrix representations as follows. 

Theorem 10.3: Suppose W is an invariant subspace of T : V → V. Then T has a block
matrix representation ( A B; 0 C ) where A is a matrix representation of
the restriction of T to W.






INVARIANT DIRECT-SUM DECOMPOSITIONS 

A vector space V is termed the direct sum of its subspaces W₁, ..., Wᵣ, written

    V = W₁ ⊕ W₂ ⊕ ··· ⊕ Wᵣ

if every vector v ∈ V can be written uniquely in the form

    v = w₁ + w₂ + ··· + wᵣ   with   wᵢ ∈ Wᵢ

The following theorem applies.

Theorem 10.4: Suppose W₁, ..., Wᵣ are subspaces of V, and suppose

    {w₁₁, ..., w₁ₙ₁},  ...,  {wᵣ₁, ..., wᵣₙᵣ}

are bases of W₁, ..., Wᵣ respectively. Then V is the direct sum of the
Wᵢ if and only if the union {w₁₁, ..., w₁ₙ₁, ..., wᵣ₁, ..., wᵣₙᵣ} is a basis
of V.

Now suppose T : V → V is linear and V is the direct sum of (nonzero) T-invariant
subspaces W₁, ..., Wᵣ:

    V = W₁ ⊕ ··· ⊕ Wᵣ   and   T(Wᵢ) ⊆ Wᵢ,  i = 1, ..., r

Let Tᵢ denote the restriction of T to Wᵢ. Then T is said to be decomposable into the operators
Tᵢ or T is said to be the direct sum of the Tᵢ, written T = T₁ ⊕ ··· ⊕ Tᵣ. Also, the sub-
spaces W₁, ..., Wᵣ are said to reduce T or to form a T-invariant direct-sum decomposition of V.

Consider the special case where two subspaces U and W reduce an operator T : V → V;
say, dim U = 2 and dim W = 3, and suppose {u₁, u₂} and {w₁, w₂, w₃} are bases of U and
W respectively. If T₁ and T₂ denote the restrictions of T to U and W respectively, then

    T₁(u₁) = a₁₁u₁ + a₁₂u₂            T₂(w₁) = b₁₁w₁ + b₁₂w₂ + b₁₃w₃
    T₁(u₂) = a₂₁u₁ + a₂₂u₂            T₂(w₂) = b₂₁w₁ + b₂₂w₂ + b₂₃w₃
                                      T₂(w₃) = b₃₁w₁ + b₃₂w₂ + b₃₃w₃

Hence

    A = ( a₁₁ a₂₁; a₁₂ a₂₂ )   and   B = ( b₁₁ b₂₁ b₃₁; b₁₂ b₂₂ b₃₂; b₁₃ b₂₃ b₃₃ )

are matrix representations of T₁ and T₂ respectively. By the above theorem {u₁, u₂, w₁, w₂, w₃}
is a basis of V. Since T(uᵢ) = T₁(uᵢ) and T(wⱼ) = T₂(wⱼ), the matrix of T in this basis is
the block diagonal matrix ( A 0; 0 B ).

A generalization of the above argument gives us the following theorem.

Theorem 10.5: Suppose T : V → V is linear and V is the direct sum of T-invariant sub-
spaces W₁, ..., Wᵣ. If Aᵢ is a matrix representation of the restriction of
T to Wᵢ, then T can be represented by the block diagonal matrix

    M = ( A₁ 0 ... 0; 0 A₂ ... 0; ...; 0 0 ... Aᵣ )

The block diagonal matrix M with diagonal entries A₁, ..., Aᵣ is sometimes called the
direct sum of the matrices A₁, ..., Aᵣ and denoted by M = A₁ ⊕ ··· ⊕ Aᵣ.



PRIMARY DECOMPOSITION 

The following theorem shows that any operator T : V → V is decomposable into oper-
ators whose minimal polynomials are powers of irreducible polynomials. This is the first
step in obtaining a canonical form for T.

Primary Decomposition Theorem 10.6: Let T : V → V be a linear operator with minimal
polynomial

    m(t) = f₁(t)^{n₁} f₂(t)^{n₂} ··· fᵣ(t)^{nᵣ}

where the fᵢ(t) are distinct monic irreducible polynomials. Then V is the
direct sum of T-invariant subspaces W₁, ..., Wᵣ where Wᵢ is the kernel of
fᵢ(T)^{nᵢ}. Moreover, fᵢ(t)^{nᵢ} is the minimal polynomial of the restriction of
T to Wᵢ.

Since the polynomials fᵢ(t)^{nᵢ} are relatively prime, the above fundamental result follows
(Problem 10.11) from the next two theorems.

Theorem 10.7: Suppose T : V → V is linear, and suppose f(t) = g(t)h(t) are polynomials
such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V is the
direct sum of the T-invariant subspaces U and W, where U = Ker g(T)
and W = Ker h(T).

Theorem 10.8: In Theorem 10.7, if f(t) is the minimal polynomial of T [and g(t) and h(t)
are monic], then g(t) and h(t) are the minimal polynomials of the restric-
tions of T to U and W respectively.

We will also use the primary decomposition theorem to prove the following useful
characterization of diagonalizable operators.

Theorem 10.9: A linear operator T : V → V has a diagonal matrix representation if and
only if its minimal polynomial m(t) is a product of distinct linear
polynomials.

Alternate Form of Theorem 10.9: A matrix A is similar to a diagonal matrix if and only
if its minimal polynomial is a product of distinct linear polynomials.

Example 10.4: Suppose A ≠ I is a square matrix for which A³ = I. Determine whether or not
A is similar to a diagonal matrix if A is a matrix over (i) the real field R, (ii) the
complex field C.

Since A³ = I, A is a zero of the polynomial f(t) = t³ − 1 = (t − 1)(t² + t + 1).
The minimal polynomial m(t) of A cannot be t − 1, since A ≠ I. Hence

    m(t) = t² + t + 1   or   m(t) = t³ − 1

Since neither polynomial is a product of linear polynomials over R, A is not diag-
onalizable over R. On the other hand, each of the polynomials is a product of distinct
linear polynomials over C. Hence A is diagonalizable over C.
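A concrete instance of Example 10.4, sketched in SymPy. The rotation-by-120° matrix below is my own choice of a real matrix with A ≠ I and A³ = I; its minimal polynomial is t² + t + 1, so it is diagonalizable over C but not over R.

    from sympy import Matrix, Symbol, eye

    t = Symbol('t')
    A = Matrix([[0, -1], [1, -1]])            # A != I and A**3 == I
    print(A**3 == eye(2))                     # True
    print(A.charpoly(t).as_expr())            # t**2 + t + 1, irreducible over R

    print(A.is_diagonalizable())                     # True: diagonalizable over C
    print(A.is_diagonalizable(reals_only=True))      # False: no real eigenvalues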



NILPOTENT OPERATORS 

A linear operator T : V → V is termed nilpotent if Tⁿ = 0 for some positive integer n;
we call k the index of nilpotency of T if Tᵏ = 0 but Tᵏ⁻¹ ≠ 0. Analogously, a square matrix
A is termed nilpotent if Aⁿ = 0 for some positive integer n, and of index k if Aᵏ = 0 but
Aᵏ⁻¹ ≠ 0. Clearly the minimum polynomial of a nilpotent operator (matrix) of index k is
m(t) = tᵏ; hence 0 is its only eigenvalue.

The fundamental result on nilpotent operators follows.

Theorem 10.10: Let T : V → V be a nilpotent operator of index k. Then T has a block
diagonal matrix representation whose diagonal entries are of the form

    N = ( 0 1 0 ... 0 0; 0 0 1 ... 0 0; ...; 0 0 0 ... 0 1; 0 0 0 ... 0 0 )

(i.e. all entries of N are 0 except those just above the main diagonal, where
they are 1). There is at least one N of order k and all other N are of orders
≤ k. The number of N of each possible order is uniquely determined by
T. Moreover, the total number of N of all orders is equal to the nullity
of T.

In the proof of the above theorem, we shall show that the number of N of order i is
2mᵢ − mᵢ₊₁ − mᵢ₋₁, where mᵢ is the nullity of Tⁱ.

We remark that the above matrix N is itself nilpotent and that its index of nilpotency is
equal to its order (Problem 10.13). Note that the matrix N of order 1 is just the 1 × 1 zero
matrix (0).



JORDAN CANONICAL FORM 

An operator T can be put into Jordan canonical form if its characteristic and minimal
polynomials factor into linear polynomials. This is always true if K is the complex field C.
In any case, we can always extend the base field K to a field in which the characteristic
and minimum polynomials do factor into linear factors; thus in a broad sense every operator
has a Jordan canonical form. Analogously, every matrix is similar to a matrix in Jordan
canonical form.

Theorem 10.11: Let T : V → V be a linear operator whose characteristic and minimum
polynomials are respectively

    Δ(t) = (t − λ₁)^{n₁} ··· (t − λᵣ)^{nᵣ}   and   m(t) = (t − λ₁)^{m₁} ··· (t − λᵣ)^{mᵣ}

where the λᵢ are distinct scalars. Then T has a block diagonal matrix
representation J whose diagonal entries are of the form

    Jᵢⱼ = ( λᵢ 1 0 ... 0; 0 λᵢ 1 ... 0; ...; 0 0 0 ... 1; 0 0 0 ... λᵢ )

For each λᵢ the corresponding blocks Jᵢⱼ have the following properties:

(i) There is at least one Jᵢⱼ of order mᵢ; all other Jᵢⱼ are of order ≤ mᵢ.
(ii) The sum of the orders of the Jᵢⱼ is nᵢ.
(iii) The number of Jᵢⱼ equals the geometric multiplicity of λᵢ.
(iv) The number of Jᵢⱼ of each possible order is uniquely determined by T.

The matrix J appearing in the above theorem is called the Jordan canonical form of the
operator T. A diagonal block Jᵢⱼ is called a Jordan block belonging to the eigenvalue λᵢ.
Observe that

    Jᵢⱼ = ( λᵢ 0 ... 0; 0 λᵢ ... 0; ...; 0 0 ... λᵢ ) + ( 0 1 0 ... 0; 0 0 1 ... 0; ...; 0 0 0 ... 1; 0 0 0 ... 0 )

That is,

    Jᵢⱼ = λᵢI + N

where N is the nilpotent block appearing in Theorem 10.10. In fact, we prove the above
theorem (Problem 10.18) by showing that T can be decomposed into operators, each the sum
of a scalar operator and a nilpotent operator.

Example 10.5: Suppose the characteristic and minimum polynomials of an operator T are respec-
tively

    Δ(t) = (t − 2)⁴(t − 3)³   and   m(t) = (t − 2)²(t − 3)²

Then the Jordan canonical form of T is one of the following matrices:

    diag( ( 2 1; 0 2 ), ( 2 1; 0 2 ), ( 3 1; 0 3 ), ( 3 ) )   or   diag( ( 2 1; 0 2 ), ( 2 ), ( 2 ), ( 3 1; 0 3 ), ( 3 ) )

The first matrix occurs if T has two independent eigenvectors belonging to its eigen-
value 2; and the second matrix occurs if T has three independent eigenvectors be-
longing to 2.
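SymPy can compute Jordan canonical forms directly. A minimal sketch with an illustrative 2 × 2 matrix of my own choosing (not the operator of Example 10.5): its characteristic polynomial is (t − 4)² but it has only one independent eigenvector, so its Jordan form is a single 2 × 2 block.

    from sympy import Matrix

    A = Matrix([[5, 1], [-1, 3]])    # characteristic polynomial (t - 4)**2, one independent eigenvector
    P, J = A.jordan_form()           # A = P * J * P**(-1) with J in Jordan canonical form
    print(J)                         # Matrix([[4, 1], [0, 4]]): one 2x2 Jordan block for eigenvalue 4
    print(P * J * P.inv() == A)      # True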



CYCLIC SUBSPACES 

Let T be a linear operator on a vector space V of finite dimension over K. Suppose
v ∈ V and v ≠ 0. The set of all vectors of the form f(T)(v), where f(t) ranges over all
polynomials over K, is a T-invariant subspace of V called the T-cyclic subspace of V gen-
erated by v; we denote it by Z(v, T) and denote the restriction of T to Z(v, T) by T_v. We
could equivalently define Z(v, T) as the intersection of all T-invariant subspaces of V
containing v.

Now consider the sequence

    v, T(v), T²(v), T³(v), ...

of powers of T acting on v. Let k be the lowest integer such that Tᵏ(v) is a linear com-
bination of those vectors which precede it in the sequence; say,

    Tᵏ(v) = −aₖ₋₁Tᵏ⁻¹(v) − ··· − a₁T(v) − a₀v

Then

    m_v(t) = tᵏ + aₖ₋₁tᵏ⁻¹ + ··· + a₁t + a₀

is the unique monic polynomial of lowest degree for which m_v(T)(v) = 0. We call m_v(t) the
T-annihilator of v and Z(v, T).

The following theorem applies.

Theorem 10.12: Let Z(v, T), T_v and m_v(t) be defined as above. Then:

(i) The set {v, T(v), ..., Tᵏ⁻¹(v)} is a basis of Z(v, T); hence dim Z(v, T) = k.

(ii) The minimal polynomial of T_v is m_v(t).

(iii) The matrix representation of T_v in the above basis is

    C = ( 0 0 ... 0 −a₀; 1 0 ... 0 −a₁; 0 1 ... 0 −a₂; ...; 0 0 ... 0 −aₖ₋₂; 0 0 ... 1 −aₖ₋₁ )

The above matrix C is called the companion matrix of the polynomial m_v(t).



RATIONAL CANONICAL FORM 

In this section we present the rational canonical form for a linear operator T : V → V.
We emphasize that this form exists even when the minimal polynomial cannot be factored
into linear polynomials. (Recall that this is not the case for the Jordan canonical form.)

Lemma 10.13: Let T : V → V be a linear operator whose minimal polynomial is f(t)ⁿ where
f(t) is a monic irreducible polynomial. Then V is the direct sum

    V = Z(v₁, T) ⊕ ··· ⊕ Z(vᵣ, T)

of T-cyclic subspaces Z(vᵢ, T) with corresponding T-annihilators

    f(t)^{n₁}, f(t)^{n₂}, ..., f(t)^{nᵣ},   n = n₁ ≥ n₂ ≥ ··· ≥ nᵣ

Any other decomposition of V into T-cyclic subspaces has the same number
of components and the same set of T-annihilators.

We emphasize that the above lemma does not say that the vectors vᵢ or the T-cyclic sub-
spaces Z(vᵢ, T) are uniquely determined by T; but it does say that the set of T-annihilators
is uniquely determined by T. Thus T has a unique matrix representation

    ( C₁ 0 ... 0; 0 C₂ ... 0; ...; 0 0 ... Cᵣ )

where the Cᵢ are companion matrices. In fact, the Cᵢ are the companion matrices of the
polynomials f(t)^{nᵢ}.

Using the primary decomposition theorem and the above lemma, we obtain the following
fundamental result.

Theorem 10.14: Let T : V → V be a linear operator with minimal polynomial

    m(t) = f₁(t)^{m₁} f₂(t)^{m₂} ··· fₛ(t)^{mₛ}

where the fᵢ(t) are distinct monic irreducible polynomials. Then T has a
unique block diagonal matrix representation

    diag( C₁₁, ..., C₁ᵣ₁, ..., Cₛ₁, ..., Cₛᵣₛ )

where the Cᵢⱼ are companion matrices. In particular, the Cᵢⱼ are the com-
panion matrices of the polynomials fᵢ(t)^{nᵢⱼ} where

    m₁ = n₁₁ ≥ n₁₂ ≥ ··· ≥ n₁ᵣ₁,   ...,   mₛ = nₛ₁ ≥ nₛ₂ ≥ ··· ≥ nₛᵣₛ




The above matrix representation of T is called its rational canonical form. The poly-
nomials fᵢ(t)^{nᵢⱼ} are called the elementary divisors of T.

Example 10.6: Let V be a vector space of dimension 6 over R, and let T be a linear operator whose
minimal polynomial is m(t) = (t² − t + 3)(t − 2)². Then the rational canonical form
of T is one of the following direct sums of companion matrices:

    (i) C(t² − t + 3) ⊕ C(t² − t + 3) ⊕ C((t − 2)²)
    (ii) C(t² − t + 3) ⊕ C((t − 2)²) ⊕ C((t − 2)²)
    (iii) C(t² − t + 3) ⊕ C((t − 2)²) ⊕ C(t − 2) ⊕ C(t − 2)

where C(f(t)) is the companion matrix of f(t); that is,

    C(t² − t + 3) = ( 0 −3; 1 1 ),   C((t − 2)²) = ( 0 −4; 1 4 ),   C(t − 2) = ( 2 )

QUOTIENT SPACES 

Let V be a vector space over a field K and let W be a subspace of V. If v is any vector
in V, we write v + W for the set of sums v + w with w ∈ W:

    v + W = { v + w : w ∈ W }

These sets are called the cosets of W in V. We show (Problem 10.22) that these cosets
partition V into mutually disjoint subsets.

Example 10.7: Let W be the subspace of R² defined by

    W = { (a, b) : a = b }

That is, W is the line given by the equation x − y = 0. We can view
v + W as a translation of the line, obtained by adding the vector v to
each point in W. As noted in the diagram on the right, v + W is also
a line and is parallel to W. Thus the cosets of W in R² are precisely
all the lines parallel to W.

In the next theorem we use the cosets of a subspace W of a vector space V to define a
new vector space; it is called the quotient space of V by W and is denoted by V/W.

Theorem 10.15: Let W be a subspace of a vector space V over a field K. Then the cosets of
W in V form a vector space over K with the following operations of addi-
tion and scalar multiplication:

    (i) (u + W) + (v + W) = (u + v) + W
    (ii) k(u + W) = ku + W, where k ∈ K

We note that, in the proof of the above theorem, it is first necessary to show that the
operations are well defined; that is, whenever u + W = u′ + W and v + W = v′ + W, then

    (i) (u + v) + W = (u′ + v′) + W   and   (ii) ku + W = ku′ + W, for any k ∈ K




In the case of an invariant subspace, we have the following useful result. 

Theorem 10.16: Suppose W is a subspace invariantunder a linear operator^ T : V -» V. 
Then T induces a linear operator f on V/W defined by T{v -\-W) = 
T{v) + W. Moreover, if T is a zero of any polynomial, then so is T. 
Thus the minimum polynomial of T divides the minimum polynomial of T. 



Solved Problems 

INVARIANT SUBSPACES 

10.1. Suppose T : V → V is linear. Show that each of the following is invariant under T:
(i) {0}, (ii) V, (iii) kernel of T, (iv) image of T.

(i) We have T(0) = 0 ∈ {0}; hence {0} is invariant under T.

(ii) For every v ∈ V, T(v) ∈ V; hence V is invariant under T.

(iii) Let u ∈ Ker T. Then T(u) = 0 ∈ Ker T since the kernel of T is a subspace of V. Thus
Ker T is invariant under T.

(iv) Since T(v) ∈ Im T for every v ∈ V, it is certainly true if v ∈ Im T. Hence the image of
T is invariant under T.

10.2. Suppose {Wᵢ} is a collection of T-invariant subspaces of a vector space V. Show that
the intersection W = ∩ᵢ Wᵢ is also T-invariant.

Suppose v ∈ W; then v ∈ Wᵢ for every i. Since Wᵢ is T-invariant, T(v) ∈ Wᵢ for every i.
Thus T(v) ∈ W = ∩ᵢ Wᵢ and so W is T-invariant.

10.3. Prove Theorem 10.2: Let T : V → V be any linear operator and let f(t) be any poly-
nomial. Then the kernel of f(T) is invariant under T.

Suppose v ∈ Ker f(T), i.e. f(T)(v) = 0. We need to show that T(v) also belongs to the kernel
of f(T), i.e. f(T)(T(v)) = 0. Since f(t)t = t f(t), we have f(T)T = T f(T). Thus

    f(T)T(v) = T f(T)(v) = T(0) = 0

as required.

10.4. Find all invariant subspaces of A = ( 2 −5; 1 −2 ) viewed as an operator on R².

First of all, we have that R² and {0} are invariant under A. Now if A has any other invariant
subspace, it must be 1-dimensional. However, the characteristic polynomial of A is

    Δ(t) = |tI − A| = | t−2 5; −1 t+2 | = t² + 1

Hence A has no eigenvalues (in R) and so A has no eigenvectors. But the 1-dimensional invariant
subspaces correspond to the eigenvectors; thus R² and {0} are the only subspaces invariant under A.

10.5. Prove Theorem 10.3: Suppose W is an invariant subspace of T : V → V. Then T
has a block matrix representation ( A B; 0 C ) where A is a matrix representa-
tion of the restriction T̂ of T to W.

We choose a basis {w₁, ..., wᵣ} of W and extend it to a basis {w₁, ..., wᵣ, v₁, ..., vₛ} of V.
We have

    T̂(w₁) = T(w₁) = a₁₁w₁ + ··· + a₁ᵣwᵣ
    T̂(w₂) = T(w₂) = a₂₁w₁ + ··· + a₂ᵣwᵣ
    ..................................................
    T̂(wᵣ) = T(wᵣ) = aᵣ₁w₁ + ··· + aᵣᵣwᵣ
    T(v₁) = b₁₁w₁ + ··· + b₁ᵣwᵣ + c₁₁v₁ + ··· + c₁ₛvₛ
    T(v₂) = b₂₁w₁ + ··· + b₂ᵣwᵣ + c₂₁v₁ + ··· + c₂ₛvₛ
    ..................................................
    T(vₛ) = bₛ₁w₁ + ··· + bₛᵣwᵣ + cₛ₁v₁ + ··· + cₛₛvₛ

But the matrix of T in this basis is the transpose of the matrix of coefficients in the above system
of equations. (See page 150.) Therefore it has the form ( A B; 0 C ) where A is the transpose of
the matrix of coefficients for the obvious subsystem. By the same argument, A is the matrix of
T̂ relative to the basis {wᵢ} of W.

10.6. Let T̂ denote the restriction of an operator T to an invariant subspace W, i.e.
T̂(w) = T(w) for every w ∈ W. Prove:

(i) For any polynomial f(t), f(T̂)(w) = f(T)(w).

(ii) The minimum polynomial of T̂ divides the minimum polynomial of T.

(i) If f(t) = 0 or if f(t) is a constant, i.e. of degree zero, then the result clearly holds. Assume
deg f = n > 0 and that the result holds for polynomials of degree less than n. Suppose that

    f(t) = aₙtⁿ + aₙ₋₁tⁿ⁻¹ + ··· + a₁t + a₀

Then

    f(T̂)(w) = (aₙT̂ⁿ + aₙ₋₁T̂ⁿ⁻¹ + ··· + a₀I)(w)
            = (aₙT̂ⁿ⁻¹)(T̂(w)) + (aₙ₋₁T̂ⁿ⁻¹ + ··· + a₀I)(w)
            = (aₙTⁿ⁻¹)(T(w)) + (aₙ₋₁Tⁿ⁻¹ + ··· + a₀I)(w)
            = f(T)(w)

(ii) Let m(t) denote the minimum polynomial of T. Then by (i), m(T̂)(w) = m(T)(w) = 0(w) = 0
for every w ∈ W; that is, T̂ is a zero of the polynomial m(t). Hence the minimum polynomial
of T̂ divides m(t).

INVARIANT DIRECT-SUM DECOMPOSITIONS 

10.7. Prove Theorem 10.4: Suppose W₁, ..., Wᵣ are subspaces of V and suppose, for
i = 1, ..., r, {wᵢ₁, ..., wᵢₙᵢ} is a basis of Wᵢ. Then V is the direct sum of the Wᵢ if
and only if the union

    B = {w₁₁, ..., w₁ₙ₁, ..., wᵣ₁, ..., wᵣₙᵣ}

is a basis of V.

Suppose B is a basis of V. Then, for any v ∈ V,

    v = a₁₁w₁₁ + ··· + a₁ₙ₁w₁ₙ₁ + ··· + aᵣ₁wᵣ₁ + ··· + aᵣₙᵣwᵣₙᵣ = w₁ + w₂ + ··· + wᵣ

where wᵢ = aᵢ₁wᵢ₁ + ··· + aᵢₙᵢwᵢₙᵢ ∈ Wᵢ. We next show that such a sum is unique. Suppose

    v = w₁′ + w₂′ + ··· + wᵣ′   where   wᵢ′ ∈ Wᵢ

Since {wᵢ₁, ..., wᵢₙᵢ} is a basis of Wᵢ, wᵢ′ = bᵢ₁wᵢ₁ + ··· + bᵢₙᵢwᵢₙᵢ and so

    v = b₁₁w₁₁ + ··· + b₁ₙ₁w₁ₙ₁ + ··· + bᵣ₁wᵣ₁ + ··· + bᵣₙᵣwᵣₙᵣ

Since B is a basis of V, aᵢⱼ = bᵢⱼ for each i and each j. Hence wᵢ = wᵢ′ and so the sum for v
is unique. Accordingly, V is the direct sum of the Wᵢ.

Conversely, suppose V is the direct sum of the Wᵢ. Then for any v ∈ V, v = w₁ + ··· + wᵣ
where wᵢ ∈ Wᵢ. Since {wᵢⱼ} is a basis of Wᵢ, each wᵢ is a linear combination of the wᵢⱼ and so v
is a linear combination of the elements of B. Thus B spans V. We now show that B is linearly
independent. Suppose

    a₁₁w₁₁ + ··· + a₁ₙ₁w₁ₙ₁ + ··· + aᵣ₁wᵣ₁ + ··· + aᵣₙᵣwᵣₙᵣ = 0

Note that aᵢ₁wᵢ₁ + ··· + aᵢₙᵢwᵢₙᵢ ∈ Wᵢ. We also have that 0 = 0 + 0 + ··· + 0 where 0 ∈ Wᵢ. Since
such a sum for 0 is unique,

    aᵢ₁wᵢ₁ + ··· + aᵢₙᵢwᵢₙᵢ = 0   for i = 1, ..., r

The independence of the bases {wᵢⱼ} implies that all the a's are 0. Thus B is linearly independent
and hence is a basis of V.

10.8. Suppose T : V → V is linear and suppose T = T1 ⊕ T2 with respect to a T-invariant
direct-sum decomposition V = U ⊕ W. Show that:
(i) m(t) is the least common multiple of m1(t) and m2(t), where m(t), m1(t) and m2(t)
are the minimum polynomials of T, T1 and T2 respectively;
(ii) Δ(t) = Δ1(t) Δ2(t), where Δ(t), Δ1(t) and Δ2(t) are the characteristic polynomials of
T, T1 and T2 respectively.

(i) By Problem 10.6, each of m1(t) and m2(t) divides m(t). Now suppose f(t) is a multiple of both
m1(t) and m2(t); then f(T1)(U) = 0 and f(T2)(W) = 0. Let v ∈ V; then v = u + w with
u ∈ U and w ∈ W. Now

    f(T) v = f(T) u + f(T) w = f(T1) u + f(T2) w = 0 + 0 = 0

That is, T is a zero of f(t). Hence m(t) divides f(t), and so m(t) is the least common multiple of
m1(t) and m2(t).

(ii) By Theorem 10.5, T has a matrix representation M = ( A  0 )
                                                         ( 0  B )
where A and B are matrix representations of T1 and T2 respectively. Then, by Problem 9.66,

    Δ(t) = |tI - M| = | tI - A     0    |  =  |tI - A| |tI - B|  =  Δ1(t) Δ2(t)
                      |   0      tI - B |

as required.


10.9. Prove Theorem 10.7: Suppose T : V → V is linear, and suppose f(t) = g(t) h(t) are
polynomials such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V
is the direct sum of the T-invariant subspaces U and W where U = Ker g(T) and
W = Ker h(T).

Note first that U and W are T-invariant by Theorem 10.2. Now since g(t) and h(t) are relatively
prime, there exist polynomials r(t) and s(t) such that

    r(t) g(t) + s(t) h(t) = 1

Hence for the operator T,     r(T) g(T) + s(T) h(T) = I     (*)

Let v ∈ V; then by (*),     v = r(T) g(T) v + s(T) h(T) v

But the first term in this sum belongs to W = Ker h(T) since

    h(T) r(T) g(T) v = r(T) g(T) h(T) v = r(T) f(T) v = r(T) 0 v = 0

Similarly, the second term belongs to U. Hence V is the sum of U and W.

To prove that V = U ⊕ W, we must show that a sum v = u + w with u ∈ U, w ∈ W, is
uniquely determined by v. Applying the operator r(T) g(T) to v = u + w and using g(T) u = 0,
we obtain

    r(T) g(T) v = r(T) g(T) u + r(T) g(T) w = r(T) g(T) w

Also, applying (*) to w alone and using h(T) w = 0, we obtain

    w = r(T) g(T) w + s(T) h(T) w = r(T) g(T) w

Both of the above formulas give us w = r(T) g(T) v and so w is uniquely determined by v. Similarly u is uniquely determined by v. Hence V = U ⊕ W, as required.




10.10. Prove Theorem 10.8: In Theorem 10.7 (Problem 10.9), if f(t) is the minimal polynomial of T (and g(t) and h(t) are monic), then g(t) is the minimal polynomial of the
restriction T1 of T to U and h(t) is the minimal polynomial of the restriction T2 of
T to W.

Let m1(t) and m2(t) be the minimal polynomials of T1 and T2 respectively. Note that g(T1) = 0
and h(T2) = 0 because U = Ker g(T) and W = Ker h(T). Thus

    m1(t) divides g(t)   and   m2(t) divides h(t)     (1)

By Problem 10.8, applied to the decomposition of Problem 10.9, f(t) is the least common multiple of m1(t) and m2(t). But m1(t) and m2(t) are
relatively prime since g(t) and h(t) are relatively prime. Accordingly, f(t) = m1(t) m2(t). We also
have that f(t) = g(t) h(t). These two equations together with (1) and the fact that all the polynomials
are monic imply that g(t) = m1(t) and h(t) = m2(t), as required.



10.11. Prove the Primary Decomposition Theorem 10.6: Let T : V → V be a linear operator
with minimal polynomial

    m(t) = f1(t)^{n1} f2(t)^{n2} ... fr(t)^{nr}

where the fi(t) are distinct monic irreducible polynomials. Then V is the direct sum
of T-invariant subspaces W1, ..., Wr where Wi is the kernel of fi(T)^{ni}. Moreover,
fi(t)^{ni} is the minimal polynomial of the restriction of T to Wi.

The proof is by induction on r. The case r = 1 is trivial. Suppose that the theorem has been
proved for r - 1. By Theorem 10.7 we can write V as the direct sum of T-invariant subspaces W1
and V1 where W1 is the kernel of f1(T)^{n1} and where V1 is the kernel of f2(T)^{n2} ... fr(T)^{nr}. By
Theorem 10.8, the minimal polynomials of the restrictions of T to W1 and V1 are respectively f1(t)^{n1}
and f2(t)^{n2} ... fr(t)^{nr}.

Denote the restriction of T to V1 by T1. By the inductive hypothesis, V1 is the direct sum of
subspaces W2, ..., Wr such that Wi is the kernel of fi(T1)^{ni} and such that fi(t)^{ni} is the minimal polynomial for the restriction of T1 to Wi. But the kernel of fi(T)^{ni}, for i = 2, ..., r, is necessarily
contained in V1 since fi(t)^{ni} divides f2(t)^{n2} ... fr(t)^{nr}. Thus the kernel of fi(T)^{ni} is the same as the
kernel of fi(T1)^{ni}, which is Wi. Also, the restriction of T to Wi is the same as the restriction of T1
to Wi (for i = 2, ..., r); hence fi(t)^{ni} is also the minimal polynomial for the restriction of T to Wi.
Thus V = W1 ⊕ W2 ⊕ ... ⊕ Wr is the desired decomposition of T.



10.12. Prove Theorem 10.9: A linear operator T : V → V has a diagonal matrix representation if and only if its minimal polynomial m(t) is a product of distinct linear
polynomials.

Suppose m(t) is a product of distinct linear polynomials; say,

    m(t) = (t - λ1)(t - λ2) ... (t - λr)

where the λi are distinct scalars. By the primary decomposition theorem, V is the direct sum of
subspaces W1, ..., Wr where Wi = Ker (T - λi I). Thus if v ∈ Wi, then (T - λi I)(v) = 0 or
T(v) = λi v. In other words, every vector in Wi is an eigenvector belonging to the eigenvalue λi. By
Theorem 10.4, the union of bases for W1, ..., Wr is a basis of V. This basis consists of eigenvectors
and so T is diagonalizable.

Conversely, suppose T is diagonalizable, i.e. V has a basis consisting of eigenvectors of T. Let
λ1, ..., λs be the distinct eigenvalues of T. Then the operator

    f(T) = (T - λ1 I)(T - λ2 I) ... (T - λs I)

maps each basis vector into 0. Thus f(T) = 0 and hence the minimum polynomial m(t) of T divides
the polynomial

    f(t) = (t - λ1)(t - λ2) ... (t - λs)

Accordingly, m(t) is a product of distinct linear polynomials.
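
To make the criterion concrete, here is a short sympy sketch (an added illustration; the two matrices below are chosen arbitrarily for this check, they do not come from the text):

    # Theorem 10.9 in action: distinct linear factors in m(t) <=> diagonalizable
    from sympy import Matrix, eye

    A = Matrix([[3, 1], [0, 3]])    # minimal polynomial (t - 3)^2  -> not diagonalizable
    B = Matrix([[2, 1], [0, 3]])    # minimal polynomial (t - 2)(t - 3) -> diagonalizable

    print((A - 3*eye(2))**2 == Matrix.zeros(2, 2), A.is_diagonalizable())              # True, False
    print((B - 2*eye(2))*(B - 3*eye(2)) == Matrix.zeros(2, 2), B.is_diagonalizable())  # True, True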






NILPOTENT OPERATORS, JORDAN CANONICAL FORM 

10.13. Let T : V → V be linear. Suppose, for v ∈ V, T^k(v) = 0 but T^{k-1}(v) ≠ 0. Prove:
(i) The set S = {v, T(v), ..., T^{k-1}(v)} is linearly independent.
(ii) The subspace W generated by S is T-invariant.
(iii) The restriction T̂ of T to W is nilpotent of index k.
(iv) Relative to the basis {T^{k-1}(v), ..., T(v), v} of W, the matrix of T̂ is of the form

    ( 0  1  0  ...  0  0 )
    ( 0  0  1  ...  0  0 )
    ( ................... )
    ( 0  0  0  ...  0  1 )
    ( 0  0  0  ...  0  0 )

Hence the above k-square matrix is nilpotent of index k.
(i) Suppose

    a v + a1 T(v) + a2 T²(v) + ... + a_{k-1} T^{k-1}(v) = 0     (*)

Applying T^{k-1} to (*) and using T^k(v) = 0, we obtain a T^{k-1}(v) = 0; since T^{k-1}(v) ≠ 0, a = 0.
Now applying T^{k-2} to (*) and using T^k(v) = 0 and a = 0, we find a1 T^{k-1}(v) = 0; hence
a1 = 0. Next applying T^{k-3} to (*) and using T^k(v) = 0 and a = a1 = 0, we obtain
a2 T^{k-1}(v) = 0; hence a2 = 0. Continuing this process, we find that all the a's are 0; hence
S is independent.

(ii) Let w ∈ W. Then

    w = b v + b1 T(v) + b2 T²(v) + ... + b_{k-1} T^{k-1}(v)

Using T^k(v) = 0, we have that

    T(w) = b T(v) + b1 T²(v) + ... + b_{k-2} T^{k-1}(v) ∈ W

Thus W is T-invariant.

(iii) By hypothesis T^k(v) = 0. Hence, for i = 0, ..., k - 1,

    T̂^k(T^i(v)) = T^{k+i}(v) = 0

That is, applying T̂^k to each generator of W, we obtain 0; hence T̂^k = 0 and so T̂ is nilpotent
of index at most k. On the other hand, T̂^{k-1}(v) = T^{k-1}(v) ≠ 0; hence T̂ is nilpotent of index
exactly k.

(iv) For the basis {T^{k-1}(v), T^{k-2}(v), ..., T(v), v} of W,

    T̂(T^{k-1}(v)) = T^k(v) = 0
    T̂(T^{k-2}(v)) = T^{k-1}(v)
    T̂(T^{k-3}(v)) = T^{k-2}(v)
    ................................
    T̂(T(v))       = T²(v)
    T̂(v)          = T(v)

Hence the matrix of T̂ in this basis is

    ( 0  1  0  ...  0  0 )
    ( 0  0  1  ...  0  0 )
    ( ................... )
    ( 0  0  0  ...  0  1 )
    ( 0  0  0  ...  0  0 )
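
A brief numerical illustration of (iii) and (iv) (an added sketch; the size k = 4 is chosen arbitrarily here): the k-square matrix with 1's on the superdiagonal is nilpotent of index exactly k.

    # The 4x4 "shift" matrix: N^3 != 0 but N^4 = 0
    import numpy as np

    N = np.diag(np.ones(3), k=1)                                   # 4x4, 1's above the diagonal
    print([bool(np.any(np.linalg.matrix_power(N, i))) for i in range(1, 5)])
    # [True, True, True, False]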

10.14. Let T : V → V be linear. Let U = Ker T^i and W = Ker T^{i+1}. Show that (i) U ⊆ W,
(ii) T(W) ⊆ U.

(i) Suppose u ∈ U = Ker T^i. Then T^i(u) = 0 and so T^{i+1}(u) = T(T^i(u)) = T(0) = 0. Thus
u ∈ Ker T^{i+1} = W. But this is true for every u ∈ U; hence U ⊆ W.

(ii) Similarly, if w ∈ W = Ker T^{i+1}, then T^{i+1}(w) = 0. Thus T^i(T(w)) = T^{i+1}(w) = 0, so that
T(w) ∈ Ker T^i = U, and hence T(W) ⊆ U.



10.15. Let T : V → V be linear. Let X = Ker T^{i-2}, Y = Ker T^{i-1} and Z = Ker T^i. By the
preceding problem, X ⊆ Y ⊆ Z. Suppose

    {u1, ..., ur},   {u1, ..., ur, v1, ..., vs},   {u1, ..., ur, v1, ..., vs, w1, ..., wt}

are bases of X, Y and Z respectively. Show that

    S = {u1, ..., ur, T(w1), ..., T(wt)}

is contained in Y and is linearly independent.

By the preceding problem, T(Z) ⊆ Y and hence S ⊆ Y. Now suppose S is linearly dependent.
Then there exists a relation

    a1 u1 + ... + ar ur + b1 T(w1) + ... + bt T(wt) = 0

where at least one coefficient is not zero. Furthermore, since {ui} is independent, at least one of the
bk must be nonzero. Transposing, we find

    b1 T(w1) + ... + bt T(wt) = -a1 u1 - ... - ar ur ∈ X = Ker T^{i-2}
Hence     T^{i-2}(b1 T(w1) + ... + bt T(wt)) = 0

Thus T^{i-1}(b1 w1 + ... + bt wt) = 0 and so b1 w1 + ... + bt wt ∈ Y = Ker T^{i-1}

Since {ui, vj} generates Y, we obtain a relation among the ui, vj and wk where one of the coefficients,
i.e. one of the bk, is not zero. This contradicts the fact that {ui, vj, wk} is independent. Hence S
must also be independent.



10.16. Prove Theorem 10.10: Let T : V → V be a nilpotent operator of index k. Then T
has a block diagonal matrix representation whose diagonal entries are of the form

    N  =  ( 0  1  0  ...  0  0 )
          ( 0  0  1  ...  0  0 )
          ( ................... )
          ( 0  0  0  ...  0  1 )
          ( 0  0  0  ...  0  0 )

There is at least one N of order k and all other N are of orders ≤ k. The number of
N of each possible order is uniquely determined by T. Moreover, the total number of
N of all orders is the nullity of T.

Suppose dim V = n. Let W1 = Ker T, W2 = Ker T², ..., Wk = Ker T^k. Set mi = dim Wi,
for i = 1, ..., k. Since T is of index k, Wk = V and W_{k-1} ≠ V and so m_{k-1} < m_k = n. By
Problem 10.14,

    W1 ⊆ W2 ⊆ ... ⊆ Wk = V

Thus, by induction, we can choose a basis {u1, ..., un} of V such that {u1, ..., u_{m_i}} is a basis of Wi.

We now choose a new basis for V with respect to which T has the desired form. It will be convenient to label the members of this new basis by pairs of indices. We begin by setting

    v(1, k) = u_{m_{k-1}+1},   v(2, k) = u_{m_{k-1}+2},   ...,   v(m_k - m_{k-1}, k) = u_{m_k}







and setting

    v(1, k-1) = T v(1, k),   v(2, k-1) = T v(2, k),   ...,   v(m_k - m_{k-1}, k-1) = T v(m_k - m_{k-1}, k)

By the preceding problem,

    S1 = {u1, ..., u_{m_{k-1}}, v(1, k-1), ..., v(m_k - m_{k-1}, k-1)}

is a linearly independent subset of W_{k-1}. We extend S1 to a basis of W_{k-1} by adjoining new elements (if necessary) which we denote by

    v(m_k - m_{k-1} + 1, k-1),   v(m_k - m_{k-1} + 2, k-1),   ...,   v(m_{k-1} - m_{k-2}, k-1)

Next we set

    v(1, k-2) = T v(1, k-1),   v(2, k-2) = T v(2, k-1),   ...,   v(m_{k-1} - m_{k-2}, k-2) = T v(m_{k-1} - m_{k-2}, k-1)

Again by the preceding problem,

    S2 = {u1, ..., u_{m_{k-2}}, v(1, k-2), ..., v(m_{k-1} - m_{k-2}, k-2)}

is a linearly independent subset of W_{k-2} which we can extend to a basis of W_{k-2} by adjoining
elements

    v(m_{k-1} - m_{k-2} + 1, k-2),   v(m_{k-1} - m_{k-2} + 2, k-2),   ...,   v(m_{k-2} - m_{k-3}, k-2)

Continuing in this manner we get a new basis for V which for convenient reference we arrange
as follows:

    v(1, k),   ...,   v(m_k - m_{k-1}, k)
    v(1, k-1), ...,   v(m_k - m_{k-1}, k-1),   ...,   v(m_{k-1} - m_{k-2}, k-1)
    ............................................................................
    v(1, 2),   ...,   v(m_k - m_{k-1}, 2),     ...,   v(m_{k-1} - m_{k-2}, 2),   ...,   v(m_2 - m_1, 2)
    v(1, 1),   ...,   v(m_k - m_{k-1}, 1),     ...,   v(m_{k-1} - m_{k-2}, 1),   ...,   v(m_2 - m_1, 1),   ...,   v(m_1, 1)

The bottom row forms a basis of W1, the bottom two rows form a basis of W2, etc. But what is
important for us is that T maps each vector into the vector immediately below it in the table, or into
0 if the vector is in the bottom row. That is,

    T v(i, j) = v(i, j-1)   for j > 1
    T v(i, j) = 0           for j = 1

Now it is clear (see Problem 10.13(iv)) that T will have the desired form if the v(i, j) are ordered
lexicographically: beginning with v(1, 1) and moving up the first column to v(1, k), then jumping to
v(2, 1) and moving up the second column as far as possible, etc.

Moreover, there will be exactly

    m_k - m_{k-1}                                                    diagonal entries of order k
    (m_{k-1} - m_{k-2}) - (m_k - m_{k-1}) = 2m_{k-1} - m_k - m_{k-2}   diagonal entries of order k-1
    ...........................................................
    2m_2 - m_1 - m_3                                                 diagonal entries of order 2
    2m_1 - m_2                                                       diagonal entries of order 1

as can be read off directly from the table. In particular, since the numbers m_1, ..., m_k are uniquely
determined by T, the number of diagonal entries of each order is uniquely determined by T. Finally,
the identity

    m_1 = (m_k - m_{k-1}) + (2m_{k-1} - m_k - m_{k-2}) + ... + (2m_2 - m_1 - m_3) + (2m_1 - m_2)

shows that the nullity m_1 of T is the total number of diagonal entries of T.



10.17. Suppose A is a 5 × 5 matrix of rank 2 which is nilpotent of index 2, i.e. A ≠ 0 but A² = 0.
Find the nilpotent matrix M in canonical form which is similar to A.






Since A is nilpotent of index 2, M contains a diagonal block of order 2 and none greater than
2. Since rank A = 2, the nullity of A is 5 - 2 = 3. Thus M contains 3 diagonal blocks.
Accordingly M must contain 2 diagonal blocks of order 2 and 1 of order 1; that is,

    M  =  ( 0  1 | 0  0 | 0 )
          ( 0  0 | 0  0 | 0 )
          ( 0  0 | 0  1 | 0 )
          ( 0  0 | 0  0 | 0 )
          ( 0  0 | 0  0 | 0 )
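
For a computational cross-check, sympy can produce the canonical form directly (an added sketch; the 5 × 5 matrix below is a sample nilpotent matrix of index 2 and rank 2 chosen only for illustration, not claimed to be the matrix of the printed problem):

    from sympy import Matrix

    A = Matrix([[0, 0, 1, 1, 1],
                [0, 0, 0, 1, 1],
                [0, 0, 0, 0, 0],
                [0, 0, 0, 0, 0],
                [0, 0, 0, 0, 0]])
    print(A.rank(), A**2 == Matrix.zeros(5, 5))   # 2, True  -> nullity 3, index 2
    P, J = A.jordan_form()
    print(J)   # two nilpotent blocks of order 2 and one of order 1, as above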



10.18. Prove Theorem 10.11, page 226, on the Jordan canonical form for an operator T.

By the primary decomposition theorem, T is decomposable into operators T1, ..., Tr, i.e.
T = T1 ⊕ ... ⊕ Tr, where (t - λi)^{mi} is the minimal polynomial of Ti. Thus in particular,

    (T1 - λ1 I)^{m1} = 0,   ...,   (Tr - λr I)^{mr} = 0

Set Ni = Ti - λi I. Then for i = 1, ..., r,

    Ti = Ni + λi I,   where Ni^{mi} = 0

That is, Ti is the sum of the scalar operator λi I and a nilpotent operator Ni, which is of index mi
since (t - λi)^{mi} is the minimal polynomial of Ti.

Now by Theorem 10.10 on nilpotent operators, we can choose a basis so that Ni is in canonical
form. In this basis, Ti = Ni + λi I is represented by a block diagonal matrix Mi whose diagonal
entries are the matrices Jij. The direct sum J of the matrices Mi is in Jordan canonical form and,
by Theorem 10.5, is a matrix representation of T.

Lastly we must show that the blocks Jij satisfy the required properties. Property (i) follows
from the fact that Ni is of index mi. Property (ii) is true since T and J have the same characteristic polynomial. Property (iii) is true since the nullity of Ni = Ti - λi I is equal to the geometric
multiplicity of the eigenvalue λi. Property (iv) follows from the fact that the Ti, and hence the Ni,
are uniquely determined by T.



10.19. Determine all possible Jordan canonical forms for a linear operator T : V → V whose
characteristic polynomial is Δ(t) = (t - 2)³(t - 5)².

Since t - 2 has exponent 3 in Δ(t), 2 must appear three times on the main diagonal. Similarly
5 must appear twice. Writing Jk(λ) for the k × k Jordan block with λ on the diagonal and 1's on the
superdiagonal, for example

    J3(2)  =  ( 2  1  0 )          J2(5)  =  ( 5  1 )
              ( 0  2  1 )                    ( 0  5 )
              ( 0  0  2 )

the possible Jordan canonical forms are

    (i)    diag( J3(2),               J2(5) )
    (ii)   diag( J3(2),               J1(5), J1(5) )
    (iii)  diag( J2(2), J1(2),        J2(5) )
    (iv)   diag( J2(2), J1(2),        J1(5), J1(5) )
    (v)    diag( J1(2), J1(2), J1(2), J2(5) )
    (vi)   diag( J1(2), J1(2), J1(2), J1(5), J1(5) )




10.20. Determine all possible Jordan canonical forms J for a matrix of order 5 whose
minimal polynomial is m(t) = (t - 2)².

J must have one Jordan block of order 2 and the others must be of order 2 or 1. Thus there
are only two possibilities:

    J = diag( J2(2), J2(2), J1(2) )     or     J = diag( J2(2), J1(2), J1(2), J1(2) )

Note that all the diagonal entries must be 2 since 2 is the only eigenvalue.



QUOTIENT SPACE AND TRIANGULAR FORM

10.21. Let W be a subspace of a vector space V. Show that the following are equivalent:

(i) u ∈ v + W,   (ii) u - v ∈ W,   (iii) v ∈ u + W.

Suppose u ∈ v + W. Then there exists w0 ∈ W such that u = v + w0. Hence u - v = w0 ∈ W.
Conversely, suppose u - v ∈ W. Then u - v = w0 where w0 ∈ W. Hence u = v + w0 ∈ v + W.
Thus (i) and (ii) are equivalent.

We also have: u - v ∈ W iff -(u - v) = v - u ∈ W iff v ∈ u + W. Thus (ii) and (iii) are
also equivalent.

10.22. Prove: The cosets of W in V partition V into mutually disjoint sets. That is:

(i) any two cosets u + W and v + W are either identical or disjoint; and

(ii) each v ∈ V belongs to a coset; in fact, v ∈ v + W.

Furthermore, u + W = v + W if and only if u - v ∈ W, and so (v + w) + W = v + W
for any w ∈ W.

Let v ∈ V. Since 0 ∈ W, we have v = v + 0 ∈ v + W, which proves (ii).

Now suppose the cosets u + W and v + W are not disjoint; say, the vector x belongs to both
u + W and v + W. Then u - x ∈ W and x - v ∈ W. The proof of (i) is complete if we show that
u + W = v + W. Let u + w0 be any element in the coset u + W. Since u - x, x - v and w0 belong
to W,

    (u + w0) - v = (u - x) + (x - v) + w0 ∈ W

Thus u + w0 ∈ v + W and hence the coset u + W is contained in the coset v + W. Similarly v + W
is contained in u + W and so u + W = v + W.

The last statement follows from the fact that u + W = v + W if and only if u ∈ v + W, and
by the preceding problem this is equivalent to u - v ∈ W.



10.23. Let W be the solution space of the homogeneous equation 2x + 3y + 4z = 0. Describe the cosets of W in R³.

W is a plane through the origin O = (0, 0, 0), and the cosets of W are the planes parallel to W.
Equivalently, the cosets of W are the solution sets of the family of equations

    2x + 3y + 4z = k,   k ∈ R

In particular the coset v + W, where v = (a, b, c), is the solution set of the linear equation

    2x + 3y + 4z = 2a + 3b + 4c     or     2(x - a) + 3(y - b) + 4(z - c) = 0




10.24. Suppose W is a subspace of a vector space V. Show that the operations in Theorem
10.15, page 229, are well defined; namely, show that if u + W = u' + W and v + W =
v' + W, then

(i) (u + v) + W = (u' + v') + W   and   (ii) ku + W = ku' + W, for any k ∈ K

(i) Since u + W = u' + W and v + W = v' + W, both u - u' and v - v' belong to W. But then
(u + v) - (u' + v') = (u - u') + (v - v') ∈ W. Hence (u + v) + W = (u' + v') + W.

(ii) Also, since u - u' ∈ W implies k(u - u') ∈ W, then ku - ku' = k(u - u') ∈ W; hence
ku + W = ku' + W.



10.25. Let V be a vector space and W a subspace of V. Show that the natural map η : V → V/W,
defined by η(v) = v + W, is linear.

For any u, v ∈ V and any k ∈ K, we have

    η(u + v) = (u + v) + W = (u + W) + (v + W) = η(u) + η(v)
and
    η(kv) = kv + W = k(v + W) = k η(v)

Accordingly, η is linear.

10.26. Let W be a subspace of a vector space V. Suppose {w1, ..., wr} is a basis of W and
the set of cosets {v̄1, ..., v̄s}, where v̄j = vj + W, is a basis of the quotient space.
Show that B = {v1, ..., vs, w1, ..., wr} is a basis of V. Thus dim V = dim W + dim (V/W).

Suppose u ∈ V. Since {v̄j} is a basis of V/W,

    ū = u + W = a1 v̄1 + a2 v̄2 + ... + as v̄s
Hence u = a1 v1 + ... + as vs + w where w ∈ W. Since {wi} is a basis of W,

    u = a1 v1 + ... + as vs + b1 w1 + ... + br wr
Accordingly, B generates V.

We now show that B is linearly independent. Suppose

    c1 v1 + ... + cs vs + d1 w1 + ... + dr wr = 0     (1)

Then
    c1 v̄1 + ... + cs v̄s = 0̄ = W

Since {v̄j} is independent, the c's are all 0. Substituting into (1), we find d1 w1 + ... + dr wr = 0.
Since {wi} is independent, the d's are all 0. Thus B is linearly independent and therefore a basis
of V.



10.27. Prove Theorem 10.16: Suppose W is a subspace invariant under a linear operator
T : V → V. Then T induces a linear operator T̂ on V/W defined by T̂(v + W) =
T(v) + W. Moreover, if T is a zero of any polynomial, then so is T̂. Thus the minimum polynomial of T̂ divides the minimum polynomial of T.

We first show that T̂ is well defined, i.e. if u + W = v + W then T̂(u + W) = T̂(v + W). If
u + W = v + W then u - v ∈ W and, since W is T-invariant, T(u - v) = T(u) - T(v) ∈ W.
Accordingly,

    T̂(u + W) = T(u) + W = T(v) + W = T̂(v + W)
as required.

We next show that T̂ is linear. We have

    T̂((u + W) + (v + W)) = T̂(u + v + W) = T(u + v) + W = T(u) + T(v) + W
                          = (T(u) + W) + (T(v) + W) = T̂(u + W) + T̂(v + W)
and

    T̂(k(u + W)) = T̂(ku + W) = T(ku) + W = kT(u) + W = k(T(u) + W) = kT̂(u + W)

Thus T̂ is linear.

Now, for any coset u + W in V/W,

    T̂²(u + W) = T̂(T̂(u + W)) = T̂(T(u) + W) = T(T(u)) + W = T²(u) + W

Hence T̂² is the operator on V/W induced by T². Similarly T̂^n is the operator induced by T^n for any n. Thus for any polynomial

    f(t) = an t^n + ... + a0 = Σ ai t^i,

    f(T̂)(u + W) = Σ ai T̂^i(u + W) = Σ ai (T^i(u) + W) = (Σ ai T^i(u)) + W = f(T)(u) + W

and so f(T̂) is the operator on V/W induced by f(T). Accordingly, if T is a root of f(t), then
f(T̂)(u + W) = f(T)(u) + W = 0 + W = W = 0̄ for every coset, i.e. T̂ is also a
root of f(t). Thus the theorem is proved.

10.28. Prove Theorem 10.1: Let T : V → V be a linear operator whose characteristic polynomial factors into linear polynomials. Then V has a basis in which T is represented
by a triangular matrix.

The proof is by induction on the dimension of V. If dim V = 1, then every matrix representation of T is a 1 × 1 matrix, which is triangular.

Now suppose dim V = n > 1 and that the theorem holds for spaces of dimension less than n.
Since the characteristic polynomial of T factors into linear polynomials, T has at least one eigenvalue and so at least one nonzero eigenvector v, say T(v) = a11 v. Let W be the 1-dimensional subspace spanned by v. Set V̄ = V/W. Then (Problem 10.26) dim V̄ = dim V - dim W = n - 1. Note
also that W is invariant under T. By Theorem 10.16, T induces a linear operator T̂ on V̄ whose
minimum polynomial divides the minimum polynomial of T. Since the characteristic polynomial of
T is a product of linear polynomials, so is its minimum polynomial; hence so are the minimum
and characteristic polynomials of T̂. Thus V̄ and T̂ satisfy the hypothesis of the theorem. Hence,
by induction, there exists a basis {v̄2, ..., v̄n} of V̄ such that

    T̂(v̄2) = a22 v̄2
    T̂(v̄3) = a32 v̄2 + a33 v̄3
    ....................................
    T̂(v̄n) = an2 v̄2 + an3 v̄3 + ... + ann v̄n

Now let v2, ..., vn be elements of V which belong to the cosets v̄2, ..., v̄n respectively. Then
{v, v2, ..., vn} is a basis of V (Problem 10.26). Since T̂(v̄2) = a22 v̄2, we have

    T̂(v̄2) - a22 v̄2 = 0̄     and so     T(v2) - a22 v2 ∈ W
But W is spanned by v; hence T(v2) - a22 v2 is a multiple of v, say

    T(v2) - a22 v2 = a21 v     and so     T(v2) = a21 v + a22 v2

Similarly, for i = 3, ..., n,

    T(vi) - ai2 v2 - ai3 v3 - ... - aii vi ∈ W     and so     T(vi) = ai1 v + ai2 v2 + ... + aii vi

Thus    T(v)  = a11 v

        T(v2) = a21 v + a22 v2
        ....................................
        T(vn) = an1 v + an2 v2 + ... + ann vn

and hence the matrix of T in this basis is triangular.

CYCLIC SUBSPACES, RATIONAL CANONICAL FORM

10.29. Prove Theorem 10.12: Let Z(v, T) be a T-cyclic subspace, Tv the restriction of T to
Z(v, T), and mv(t) = t^k + a_{k-1} t^{k-1} + ... + a0 the T-annihilator of v. Then:
(i) The set {v, T(v), ..., T^{k-1}(v)} is a basis of Z(v, T); hence dim Z(v, T) = k.
(ii) The minimal polynomial of Tv is mv(t).
(iii) The matrix of Tv in the above basis is the companion matrix

    C  =  ( 0  0  ...  0  -a0      )
          ( 1  0  ...  0  -a1      )
          ( 0  1  ...  0  -a2      )
          ( ...................... )
          ( 0  0  ...  0  -a_{k-2} )
          ( 0  0  ...  1  -a_{k-1} )

(i) By definition of mv(t), T^k(v) is the first vector in the sequence v, T(v), T²(v), ... which is a
linear combination of those vectors which precede it in the sequence; hence the set
B = {v, T(v), ..., T^{k-1}(v)} is linearly independent. We now only have to show that Z(v, T) =
L(B), the linear span of B. By the above, T^k(v) ∈ L(B). We prove by induction that
T^n(v) ∈ L(B) for every n. Suppose n > k and T^{n-1}(v) ∈ L(B), i.e. T^{n-1}(v) is a linear combination of v, ..., T^{k-1}(v). Then T^n(v) = T(T^{n-1}(v)) is a linear combination of T(v), ..., T^k(v).
But T^k(v) ∈ L(B); hence T^n(v) ∈ L(B) for every n. Consequently f(T)(v) ∈ L(B) for any
polynomial f(t). Thus Z(v, T) = L(B) and so B is a basis, as claimed.

(ii) Suppose m(t) = t^s + b_{s-1} t^{s-1} + ... + b0 is the minimal polynomial of Tv. Then, since
v ∈ Z(v, T),

    0 = m(Tv)(v) = m(T)(v) = T^s(v) + b_{s-1} T^{s-1}(v) + ... + b0 v

Thus T^s(v) is a linear combination of v, T(v), ..., T^{s-1}(v), and therefore k ≤ s. However,
mv(T) = 0 and so mv(Tv) = 0. Then m(t) divides mv(t) and so s ≤ k. Accordingly k = s and
hence mv(t) = m(t).

(iii)  Tv(v)          = T(v)
       Tv(T(v))       = T²(v)
       ...........................
       Tv(T^{k-2}(v)) = T^{k-1}(v)
       Tv(T^{k-1}(v)) = T^k(v) = -a0 v - a1 T(v) - a2 T²(v) - ... - a_{k-1} T^{k-1}(v)

By definition, the matrix of Tv in this basis is the transpose of the matrix of coefficients of the
above system of equations; hence it is C, as required.
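
The following short sympy sketch (an added illustration; the cubic and its coefficients are chosen arbitrarily here) checks the companion-matrix construction and, at the same time, the fact that f(t) is the characteristic polynomial of C(f(t)) (cf. Problem 10.65):

    from sympy import Matrix, symbols, eye

    t = symbols('t')
    a0, a1, a2 = 6, 11, 6                   # f(t) = t^3 + 6t^2 + 11t + 6 = (t+1)(t+2)(t+3)
    C = Matrix([[0, 0, -a0],
                [1, 0, -a1],
                [0, 1, -a2]])
    print(((t*eye(3) - C).det()).expand())  # t**3 + 6*t**2 + 11*t + 6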



10.30. Let T : V → V be linear. Let W be a T-invariant subspace of V and T̂ the induced
operator on V/W. Prove: (i) The T-annihilator of v ∈ V divides the minimal polynomial of T. (ii) The T̂-annihilator of v̄ ∈ V/W divides the minimal polynomial of T.

(i) The T-annihilator of v ∈ V is the minimal polynomial of the restriction of T to Z(v, T) and
therefore, by Problem 10.6, it divides the minimal polynomial of T.

(ii) The T̂-annihilator of v̄ ∈ V/W divides the minimal polynomial of T̂, which divides the minimal
polynomial of T by Theorem 10.16.

Remark. In case the minimal polynomial of T is f(t)^n where f(t) is a monic irreducible polynomial, then the T-annihilator of v ∈ V and the T̂-annihilator of v̄ ∈ V/W are of the form f(t)^m
where m ≤ n.



10.31. Prove Lemma 10.13: Let T : V → V be a linear operator whose minimal polynomial
is f(t)^n where f(t) is a monic irreducible polynomial. Then V is the direct sum of
T-cyclic subspaces Zi = Z(vi, T), i = 1, ..., r, with corresponding T-annihilators

    f(t)^{n1}, f(t)^{n2}, ..., f(t)^{nr},   n = n1 ≥ n2 ≥ ... ≥ nr

Any other decomposition of V into the direct sum of T-cyclic subspaces has the same
number of components and the same set of T-annihilators.

The proof is by induction on the dimension of V. If dim V = 1, then V is itself T-cyclic and
the lemma holds. Now suppose dim V > 1 and that the lemma holds for those vector spaces of
dimension less than that of V.

Since the minimal polynomial of T is f(t)^n, there exists v1 ∈ V such that f(T)^{n-1}(v1) ≠ 0;
hence the T-annihilator of v1 is f(t)^n. Let Z1 = Z(v1, T) and recall that Z1 is T-invariant. Let
V̄ = V/Z1 and let T̂ be the linear operator on V̄ induced by T. By Theorem 10.16, the minimal polynomial of T̂ divides f(t)^n; hence the hypothesis holds for V̄ and T̂. Consequently, by induction, V̄ is
the direct sum of T̂-cyclic subspaces; say,

    V̄ = Z(v̄2, T̂) ⊕ ... ⊕ Z(v̄r, T̂)

where the corresponding T̂-annihilators are f(t)^{n2}, ..., f(t)^{nr}, n ≥ n2 ≥ ... ≥ nr.

We claim that there is a vector v2 in the coset v̄2 whose T-annihilator is f(t)^{n2}, the T̂-annihilator
of v̄2. Let w be any vector in v̄2. Then f(T)^{n2}(w) ∈ Z1. Hence there exists a polynomial g(t) for
which

    f(T)^{n2}(w) = g(T)(v1)     (1)

Since f(t)^n is the minimal polynomial of T, we have by (1),

    0 = f(T)^n(w) = f(T)^{n - n2} g(T)(v1)

But f(t)^n is the T-annihilator of v1; hence f(t)^n divides f(t)^{n - n2} g(t) and so g(t) = f(t)^{n2} h(t) for
some polynomial h(t). We set

    v2 = w - h(T)(v1)

Since w - v2 = h(T)(v1) ∈ Z1, v2 also belongs to the coset v̄2. Thus the T-annihilator of v2 is a
multiple of the T̂-annihilator of v̄2. On the other hand, by (1),

    f(T)^{n2}(v2) = f(T)^{n2}(w - h(T)(v1)) = f(T)^{n2}(w) - g(T)(v1) = 0
Consequently the T-annihilator of v2 is f(t)^{n2}, as claimed.

Similarly, there exist vectors v3, ..., vr ∈ V such that vi ∈ v̄i and the T-annihilator of
vi is f(t)^{ni}, the T̂-annihilator of v̄i. We set

    Z2 = Z(v2, T),   ...,   Zr = Z(vr, T)

Let d denote the degree of f(t) so that f(t)^{ni} has degree d ni. Then since f(t)^{ni} is both the T-annihilator
of vi and the T̂-annihilator of v̄i, we know that

    {vi, T(vi), ..., T^{d ni - 1}(vi)}   and   {v̄i, T̂(v̄i), ..., T̂^{d ni - 1}(v̄i)}

are bases for Z(vi, T) and Z(v̄i, T̂) respectively, for i = 2, ..., r. But V̄ = Z(v̄2, T̂) ⊕ ... ⊕
Z(v̄r, T̂); hence

    {v̄2, ..., T̂^{d n2 - 1}(v̄2), ..., v̄r, ..., T̂^{d nr - 1}(v̄r)}

is a basis for V̄. Therefore by Problem 10.26 and the relation T̂^i(v̄) = (T^i(v)) + Z1 (see Problem 10.27),

    {v1, ..., T^{d n1 - 1}(v1), v2, ..., T^{d n2 - 1}(v2), ..., vr, ..., T^{d nr - 1}(vr)}

is a basis for V. Thus by Theorem 10.4, V = Z(v1, T) ⊕ ... ⊕ Z(vr, T), as required.

It remains to show that the exponents n1, ..., nr are uniquely determined by T. Since d denotes
the degree of f(t),

    dim V = d(n1 + ... + nr)   and   dim Zi = d ni,   i = 1, ..., r

Also, if s is any positive integer then (Problem 10.59) f(T)^s(Zi) is a cyclic subspace generated by
f(T)^s(vi) and it has dimension d(ni - s) if ni > s and dimension 0 if ni ≤ s.

Now any vector v ∈ V can be written uniquely in the form v = w1 + ... + wr where wi ∈ Zi.
Hence any vector in f(T)^s(V) can be written uniquely in the form

    f(T)^s(v) = f(T)^s(w1) + ... + f(T)^s(wr)
where f(T)^s(wi) ∈ f(T)^s(Zi). Let t be the integer, dependent on s, for which

    n1 > s, ..., nt > s,   n_{t+1} ≤ s

Then    f(T)^s(V) = f(T)^s(Z1) ⊕ ... ⊕ f(T)^s(Zt)

and so    dim (f(T)^s(V)) = d[(n1 - s) + ... + (nt - s)]     (*)

The numbers on the left of (*) are uniquely determined by T. Set s = n - 1 and (*) determines the
number of ni equal to n. Next set s = n - 2 and (*) determines the number of ni (if any) equal to
n - 1. We repeat the process until we set s = 0 and determine the number of ni equal to 1. Thus
the ni are uniquely determined by T and V, and the lemma is proved.






10.32. Let V be a vector space of dimension 7 over R, and let T : V → V be a linear operator
with minimal polynomial m(t) = (t² + 2)(t + 3)³. Find all the possible rational
canonical forms for T.

The sum of the degrees of the companion matrices must add up to 7. Also, one companion
matrix must be C(t² + 2) and one must be C((t + 3)³). Thus the rational canonical form of T is exactly one
of the following direct sums of companion matrices:

(i)   C(t² + 2) ⊕ C(t² + 2) ⊕ C((t + 3)³)

(ii)  C(t² + 2) ⊕ C((t + 3)³) ⊕ C((t + 3)²)

(iii) C(t² + 2) ⊕ C((t + 3)³) ⊕ C(t + 3) ⊕ C(t + 3)

That is, using (t + 3)³ = t³ + 9t² + 27t + 27 and (t + 3)² = t² + 6t + 9,

    (i)    diag(  ( 0 -2 ),  ( 0 -2 ),  ( 0  0  -27 )  )
                  ( 1  0 )   ( 1  0 )   ( 1  0  -27 )
                                        ( 0  1   -9 )

    (ii)   diag(  ( 0 -2 ),  ( 0  0  -27 ),  ( 0 -9 )  )
                  ( 1  0 )   ( 1  0  -27 )   ( 1 -6 )
                             ( 0  1   -9 )

    (iii)  diag(  ( 0 -2 ),  ( 0  0  -27 ),  ( -3 ),  ( -3 )  )
                  ( 1  0 )   ( 1  0  -27 )
                             ( 0  1   -9 )
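
As a cross-check on case (ii) (an added sympy sketch, not part of the original solution), one can verify the characteristic polynomial and that m(t) = (t² + 2)(t + 3)³ is indeed the minimal polynomial of that block matrix:

    from sympy import Matrix, symbols, eye, factor

    t = symbols('t')
    M = Matrix.diag(Matrix([[0, -2], [1, 0]]),
                    Matrix([[0, 0, -27], [1, 0, -27], [0, 1, -9]]),
                    Matrix([[0, -9], [1, -6]]))
    print(factor(M.charpoly(t).as_expr()))                               # (t + 3)**5*(t**2 + 2)
    print((M**2 + 2*eye(7))*(M + 3*eye(7))**3 == Matrix.zeros(7, 7))     # True:  m(M) = 0
    print((M**2 + 2*eye(7))*(M + 3*eye(7))**2 == Matrix.zeros(7, 7))     # False: exponent 3 is needed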



PROJECTIONS

10.33. Suppose V = W1 ⊕ ... ⊕ Wr. The projection of V into its subspace Wk is the mapping E : V → V defined by E(v) = wk, where v = w1 + ... + wr, wi ∈ Wi. Show
that (i) E is linear, (ii) E² = E.

(i) Since the sum v = w1 + ... + wr, wi ∈ Wi, is uniquely determined by v, the mapping E is well
defined. Suppose, for u ∈ V, u = w1' + ... + wr', wi' ∈ Wi. Then

    v + u = (w1 + w1') + ... + (wr + wr')   and   kv = kw1 + ... + kwr,   kwi, wi + wi' ∈ Wi

are the unique sums corresponding to v + u and kv. Hence

    E(v + u) = wk + wk' = E(v) + E(u)   and   E(kv) = kwk = kE(v)

and therefore E is linear.

(ii) We have that     wk = 0 + ... + 0 + wk + 0 + ... + 0

is the unique sum corresponding to wk ∈ Wk; hence E(wk) = wk. Then for any v ∈ V,

    E²(v) = E(E(v)) = E(wk) = wk = E(v)
Thus E² = E, as required.



10.34. Suppose E : V → V is linear and E² = E. Show that: (i) E(u) = u for any u ∈ Im E,
i.e. the restriction of E to its image is the identity mapping; (ii) V is the direct sum
of the image and kernel of E: V = Im E ⊕ Ker E; (iii) E is the projection of V into
Im E, its image. Thus, by the preceding problem, a linear mapping T : V → V is a
projection if and only if T² = T; this characterization of a projection is frequently
used as its definition.

(i) If u ∈ Im E, then there exists v ∈ V for which E(v) = u; hence

    E(u) = E(E(v)) = E²(v) = E(v) = u
as required.

(ii) Let v ∈ V. We can write v in the form v = E(v) + v - E(v). Now E(v) ∈ Im E and, since

    E(v - E(v)) = E(v) - E²(v) = E(v) - E(v) = 0,
v - E(v) ∈ Ker E. Accordingly, V = Im E + Ker E.

Now suppose w ∈ Im E ∩ Ker E. By (i), E(w) = w because w ∈ Im E. On the other
hand, E(w) = 0 because w ∈ Ker E. Thus w = 0 and so Im E ∩ Ker E = {0}. These two
conditions imply that V is the direct sum of the image and kernel of E.

(iii) Let v ∈ V and suppose v = u + w where u ∈ Im E and w ∈ Ker E. Note that E(u) = u
by (i), and E(w) = 0 because w ∈ Ker E. Hence

    E(v) = E(u + w) = E(u) + E(w) = u + 0 = u
That is, E is the projection of V into its image.

10.35. Suppose V = U ⊕ W and suppose T : V → V is linear. Show that U and W are both
T-invariant if and only if TE = ET, where E is the projection of V into U.

Observe that E(v) ∈ U for every v ∈ V, and that (i) E(v) = v iff v ∈ U, (ii) E(v) = 0
iff v ∈ W.

Suppose ET = TE. Let u ∈ U. Since E(u) = u,

    T(u) = T(E(u)) = (TE)(u) = (ET)(u) = E(T(u)) ∈ U

Hence U is T-invariant. Now let w ∈ W. Since E(w) = 0,

    E(T(w)) = (ET)(w) = (TE)(w) = T(E(w)) = T(0) = 0     and so     T(w) ∈ W

Hence W is also T-invariant.

Conversely, suppose U and W are both T-invariant. Let v ∈ V and suppose v = u + w where
u ∈ U and w ∈ W. Then T(u) ∈ U and T(w) ∈ W; hence E(T(u)) = T(u) and E(T(w)) = 0.

Thus    (ET)(v) = (ET)(u + w) = (ET)(u) + (ET)(w) = E(T(u)) + E(T(w)) = T(u)

and     (TE)(v) = (TE)(u + w) = T(E(u + w)) = T(u)

That is, (ET)(v) = (TE)(v) for every v ∈ V; therefore ET = TE, as required.
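
A concrete instance of Problems 10.33 and 10.34 (an added sketch; the subspaces are chosen here only for illustration): E below projects R³ onto U = span{(1,0,0), (0,1,0)} along W = span{(1,1,1)}.

    import numpy as np

    E = np.array([[1.0, 0.0, -1.0],
                  [0.0, 1.0, -1.0],
                  [0.0, 0.0,  0.0]])
    print(np.allclose(E @ E, E))      # True: E is idempotent
    v = np.array([2.0, 3.0, 5.0])
    print(E @ v, v - E @ v)           # E(v) lies in U; v - E(v) = 5*(1,1,1) lies in W = Ker E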



Supplementary Problems 

INVARIANT SUBSPACES 

10.36. Suppose W is invariant under T:V -^V. Show that W is invariant under f{T) for any polynomial 

10.37. Show that every subspace of V is invariant under I and 0, the identity and zero operators. 

10.38. Suppose W is invariant under S:V^V and T : V -^ F. Show that W is also invariant under 
S + r and ST. 

10.39. Let r : y -> y be linear and let W be the eigenspace belonging to an eigenvalue X of T. Show that 
W is r-invariant. 

10.40. Let y be a vector space of odd dimension (greater than 1) over the real field E. Show that any 
linear operator on V has an invariant subspace other than V or {0}. 

10.41. Determine the invariant subspaces of A = f _A viewed as a linear operator on (i) R2, (ii) C^. 

10.42. Suppose dim V = n. Show that T:V^V has a triangular matrix representation if and only if 
there exist T-invariant subspaces WiCWzC ■ ■ ■ cW^ = V for which dim TFfc = fc, k = l,...,n. 

INVARIANT DIRECT-SUMS 

10 43 The subspaces Wi,...,Wr are said to be independent if Wi + • • • -|- w^ = 0, Wj G Wi, implies that 
each Wi = 0. Show that L(Wd = Wi® ■■■ @Wr if and only if the Wi are independent. (Here 
LiWi) denotes the linear span of the Wi-) 

10.44. Show that V = Wi® ■•■ ®Wr if and only if (i) V = L{Wi) and (ii) W^nL{Wi, . . .,Wk-i. 
W^ + i,...,Wr) = {0}, fe = l, ...,r. 

10.45. Show that L(Wi) = W,®---®Wr if and only if dimLd^i) = dim Wi + • • • -t- dim W^. 










10.46. Suppose the characteristic polynomial of T : V -» V is A(t) = /i(f)"i /2(f)»2 . . . fAt)«' where the 
fi(t) are distinct monic irreducible polynomials. Let V = WiQ ■■• ®Wr be the primary decom- 
position of V into r-invariant subspaces. Show that /((t)™! is the characteristic polynomial of the 
restriction of 7" to PFj. 

NILPOTENT OPERATORS 

10.47. Suppose S and T are nilpotent operators which commute, i.e. ST = TS. Show that S+T and ST 
are also nilpotent. 

10.48. Suppose A is a supertriangular matrix, i.e. all entries on and below the main diagonal are 0. Show 
that A is nilpotent. 

10.49. Let V be the vector space of polynomials of degree - n. Show that the differential operator on V 
is nilpotent of index n + 1. 

10.50. Show that the following nilpotent matrices of order n are similar:

    ( 0  1  0  ...  0 )           ( 0  0  ...  0  0 )
    ( 0  0  1  ...  0 )    and    ( 1  0  ...  0  0 )
    ( ................ )          ( ................ )
    ( 0  0  0  ...  1 )           ( 0  0  ...  1  0 )
    ( 0  0  0  ...  0 )

10.51. Show that two nilpotent matrices of order 3 are similar if and only if they have the same index 
of nilpotency. Show by example that the statement is not true for nilpotent matrices of order 4. 

JORDAN CANONICAL FORM 

10.52. Find all possible Jordan canonical forms for those matrices whose characteristic polynomial A(t) 
and minimal polynomial m{t) are as follows: 

(i) Δ(t) = (t-2)⁴(t-3)²,  m(t) = (t-2)²(t-3)²

(ii) Δ(t) = (t-7)⁵,  m(t) = (t-7)²

(iii) Δ(t) = (t-2)⁷,  m(t) = (t-2)³

(iv) A(t) ^ (t-3)*(t-5)\ m(t) =: («-3)2(i-5)2 

10.53. Show that every complex matrix is similar to its transpose. {Hint. Use Jordan canonical form and 
Problem 10.50.) 

10.54. Show that all complex matrices A of order n for which A" = / are similar. 

10.55. Suppose A is a complex matrix with only real eigenvalues. Show that A is similar to a matrix with 
only real entries. 

CYCLIC SUBSPACES 

10.56. Suppose T:V -*V is linear. Prove that Z{v, T) is the intersection of all T-invariant subspaces 
containing v. 

10.57. Let fit) and g{t) be the T-annihilators of u and v respectively. Show that if f(t) and g{,t) are rel- 
atively prime, then f(t)g(t) is the T-annihilator of u + v. 

10.58. Prove that Z(m, T) = Z(v, T) if and only if g(T)(u) = v where g(f) is relatively prime to the 
r-annihilator of u. 

10.59. Let W = Z{v, T), and suppose the T-annihilator of v is /(*)" where f(t) is a monic irreducible poly- 
nomial of degree d. Show that f{T)^{W) is a cyclic subspace generated by f(Ty(v) and it has dimen- 
sion d{n — s) if n > s and dimension if n — s. 

RATIONAL CANONICAL FORM 

10.60. Find all possible rational canonical forms for:

(i) 6 × 6 matrices with minimum polynomial m(t) = (t² + 3)(t + 1)²

(ii) 6 × 6 matrices with minimum polynomial m(t) = (t + 1)³

(iii) 8 × 8 matrices with minimum polynomial m(t) = (t² + 2)²(t + 3)²
10.61. Let A be a 4 × 4 matrix with minimum polynomial m(t) = (t² + 1)(t² - 3). Find the rational
canonical form for A if A is a matrix over (i) the rational field Q, (ii) the real field R, (iii) the
complex field C.







10.62. Find the rational canonical form for the 4 × 4 Jordan block

    J  =  ( λ  1  0  0 )
          ( 0  λ  1  0 )
          ( 0  0  λ  1 )
          ( 0  0  0  λ )



10.63. Prove that the characteristic polynomial of an operator T :V -^ V is a product of its elementary 
divisors. 

10.64. Prove that two 3X3 matrices with the same minimum and characteristic polynomials are similar. 

10.65. Let C(f(t)) denote the companion matrix to an arbitrary polynomial f(t). Show that f(t) is the char- 
acteristic polynomial of C{f(t)). 

PROJECTIONS 

10.66. Suppose V = Wi® ■■■ ®Wr- Let Ei denote the projection of V into Wi- Prove: (i) EiE^ = 0, 
i^j; (ii) I = E^+ •■■ +E^. 

10.67. Let El, .. .,Er be linear operators on V such that: (i) Ef = £/;, i.e. the Ei are projections; 
(ii) EiEj = 0, i ^ i; (iii) J = Bj H +E^. Prove that V = Im Ej © • • • © Im B^- 

10.68. Suppose E -.V ^V is a projection, i.e. E^ = E. Prove that E has a matrix representation of the 
form ( ^ ) where r is the rank of E and /^ is the r-square identity matrix. 

10.69. Prove that any two projections of the same rank are similar. {Hint. Use the result of Problem 
10.68.) 

10.70. Suppose E -.V -^V is a projection. Prove: 

(i) I-E isa projection and V = ImE ® Im (I-E); (ii) I + E is invertible (if 1 + 1 # 0). 

QUOTIENT SPACES 

10.71. Let IF be a subspace of V. Suppose the set of cosets {vi + W,V2 + W, ...,■«„+ IF} in V/W is 
linearly independent. Show that the set of vectors {v^, V2, ..., vj in V is also linearly independent. 

10.72. Let IF be a subspace of V. Suppose the set of vectors {mi, Wg m„} in V is linearly independent, 

and that L(Ui) nW = {0}. Show that the set of cosets {mi + IF, . . . , m„ + IF} in V/W is also 
linearly independent. 

10.73. Suppose V = U ®W and that {mj, . . . , tt„} is a basis of U. Show that {ui + W, . . . , ii„ + IF} is 
a basis of the quotient space V/W. (Observe that no condition is placed on the dimensionality of 
V or IF.) 

10.74. Let IF be the solution space of the linear equation 

ajXi + 020:2 + • • • + an^n = 0. O'i^ K 

and let v = (5i, 63 6„) S X". Prove that the coset v + IF of IF in K" is the solution set of the 

linear equation , , , , i._„j,j^ .a. „ h 

aiXi + a2X2 + • • • + a^Xn = b where 6 - ajfti + • • • + a„6„ 

10.75. Let V be the vector space of polynomials over R and let IF be the subspace of polynomials divisible 
by t*, i.e. of the form Ugt* + a^t^ -\ h a„_^t^. Show that the quotient space V/W is of dimension 4. 

10.76. Let U and IF be subspaces of V such that WcUcV. Note that any coset «, + IF of IF in ?7 may 
also be viewed as a coset of IF in F since u&U implies w e V; hence U/W is a subset of V/W. 
Prove that (i) ?7/IF is a subspace of V/W, (ii) dim {V/W) - dim(i7/IF) = dim{V/U}. 

10.77. Let U and IF be subspaces of V. Show that the cosets of UnW in F can be obtained by inter- 
secting each of the cosets of t/ in F by each of the cosets of IF in V: 

V/{UnW) = {{v+U)n{v'+W): v,v' eV} 

10.78. Let T:V ^V be linear with kernel IF and image U. Show that 
the quotient space V/W is isomorphic to U under the mapping 
e : V/W -> U defined by e{v -I- IF) = T{v). Furthermore, show that v 
T = io 0OTJ where i; : F -> V/W is the natural mapping of V into 
V/W, i.e. r,{v) = -y -t- IF, and t : U C V is the inclusion mapping, yjy^ 
i.e. i(u) = u. (See diagram.) 







Answers to Supplementary Problems 

10.41. (i) R² and {0}.   (ii) C², {0}, W1 = L((2, 1 - 2i)) and W2 = L((2, 1 + 2i)).



10.52. Below, Jk(λ) denotes the k × k Jordan block with eigenvalue λ, as in Problem 10.19.

(i)   diag( J2(2), J2(2), J2(3) ),   diag( J2(2), J1(2), J1(2), J2(3) )

(ii)  diag( J2(7), J2(7), J1(7) ),   diag( J2(7), J1(7), J1(7), J1(7) )

(iii) diag( J3(2), J3(2), J1(2) ),   diag( J3(2), J2(2), J2(2) ),
      diag( J3(2), J2(2), J1(2), J1(2) ),   diag( J3(2), J1(2), J1(2), J1(2), J1(2) )



(iv) 




3 1 I 



Wi^ 



L. 



\ 

I 5 1 
I 5 



3 1 I 
3 I 

r 



3 1 I 



1 



5 1 I 

I 5 / 



3 1 I 
3 I 

r- 






[5 n 

' 5 I 

|5 1 
I 5 



3 1 I 
3 I 



L- - f- - 1 



3 I 



L"-' _ 



r5""i"i 



^-in_ 

I 5 / 



10.60. (i)  C(t² + 3) ⊕ C(t² + 3) ⊕ C((t + 1)²),   C(t² + 3) ⊕ C((t + 1)²) ⊕ C((t + 1)²),
            C(t² + 3) ⊕ C((t + 1)²) ⊕ C(t + 1) ⊕ C(t + 1)

(ii)  C((t + 1)³) ⊕ C((t + 1)³),   C((t + 1)³) ⊕ C((t + 1)²) ⊕ C(t + 1),
      C((t + 1)³) ⊕ C(t + 1) ⊕ C(t + 1) ⊕ C(t + 1)

(iii) C((t² + 2)²) ⊕ C((t + 3)²) ⊕ C(t² + 2),   C((t² + 2)²) ⊕ C((t + 3)²) ⊕ C((t + 3)²),
      C((t² + 2)²) ⊕ C((t + 3)²) ⊕ C(t + 3) ⊕ C(t + 3)

Here, explicitly,

    C(t² + 3) = ( 0 -3 ),   C((t + 1)²) = ( 0 -1 ),   C((t + 1)³) = ( 0  0 -1 ),
                ( 1  0 )                  ( 1 -2 )                  ( 1  0 -3 )
                                                                    ( 0  1 -3 )

    C(t² + 2) = ( 0 -2 ),   C((t + 3)²) = ( 0 -9 ),   C(t + 3) = ( -3 ),   C((t² + 2)²) = ( 0  0  0 -4 )
                ( 1  0 )                  ( 1 -6 )                                        ( 1  0  0  0 )
                                                                                          ( 0  1  0 -4 )
                                                                                          ( 0  0  1  0 )



10.61. (i)  ( 0 -1 ) ⊕ ( 0  3 )        (ii)  ( 0 -1 ) ⊕ ( √3 ) ⊕ ( -√3 )        (iii)  diag( i, -i, √3, -√3 )
            ( 1  0 )   ( 1  0 )              ( 1  0 )



10.62.  ( 0  0  0  -λ⁴ )
        ( 1  0  0  4λ³ )
        ( 0  1  0 -6λ² )
        ( 0  0  1   4λ )



chapter 11 



Linear Functionals and the Dual Space 

INTRODUCTION

In this chapter we study linear mappings from a vector space V into its field K of scalars.
(Unless otherwise stated or implied, we view K as a vector space over itself.) Naturally
all the theorems and results for arbitrary linear mappings on V hold for this special case.
However, we treat these mappings separately because of their fundamental importance and
because the special relationship of V to K gives rise to new notions and results which do not
apply in the general case.

LINEAR FUNCTIONALS AND THE DUAL SPACE

Let V be a vector space over a field K. A mapping φ : V → K is termed a linear functional (or linear form) if, for every u, v ∈ V and every a, b ∈ K,

    φ(au + bv) = a φ(u) + b φ(v)

In other words, a linear functional on V is a linear mapping from V into K.

Example 11.1: Let πi : Kⁿ → K be the ith projection mapping, i.e. πi(a1, a2, ..., an) = ai. Then πi
is linear and so it is a linear functional on Kⁿ.

Example 11.2: Let V be the vector space of polynomials in t over R. Let φ : V → R be the integral
operator defined by φ(p(t)) = ∫₀¹ p(t) dt. Recall that φ is linear; hence it is
a linear functional on V.

Example 11.3: Let V be the vector space of n-square matrices over K. Let T : V → K be the trace
mapping

    T(A) = a11 + a22 + ... + ann,   where A = (aij)

That is, T assigns to a matrix A the sum of its diagonal elements. This map is
linear (Problem 11.27) and so it is a linear functional on V.

By Theorem 6.6, the set of linear functionals on a vector space V over a field K is also
a vector space over K with addition and scalar multiplication defined by

    (φ + σ)(v) = φ(v) + σ(v)   and   (kφ)(v) = k φ(v)

where φ and σ are linear functionals on V and k ∈ K. This space is called the dual space of
V and is denoted by V*.

Example 11.4: Let V = Kⁿ, the vector space of n-tuples which we write as column vectors. Then
the dual space V* can be identified with the space of row vectors. In particular,
any linear functional φ = (a1, ..., an) in V* has the representation

    φ(x1, ..., xn) = a1 x1 + a2 x2 + ... + an xn

Historically, the above formal expression was termed a linear form.



DUAL BASIS 

Suppose V is a vector space of dimension n over K. By Theorem 6.7, the dimension of
the dual space V* is also n (since K is of dimension 1 over itself). In fact, each basis of V
determines a basis of V* as follows:

Theorem 11.1: Suppose {v1, ..., vn} is a basis of V over K. Let φ1, ..., φn ∈ V* be the
linear functionals defined by

    φi(vj) = δij = 1 if i = j,  0 if i ≠ j

Then {φ1, ..., φn} is a basis of V*.

The above basis {φi} is termed the basis dual to {vi} or the dual basis. The above formula, which uses the Kronecker delta δij, is a short way of writing

    φ1(v1) = 1, φ1(v2) = 0, φ1(v3) = 0, ..., φ1(vn) = 0
    φ2(v1) = 0, φ2(v2) = 1, φ2(v3) = 0, ..., φ2(vn) = 0
    .................................................
    φn(v1) = 0, φn(v2) = 0, φn(v3) = 0, ..., φn(vn) = 1

By Theorem 6.2, these linear mappings φi are unique and well defined.

Example 11.5: Consider the following basis of R²: {v1 = (2, 1), v2 = (3, 1)}. Find the dual basis
{φ1, φ2}.

We seek linear functionals φ1(x, y) = ax + by and φ2(x, y) = cx + dy such that

    φ1(v1) = 1,  φ1(v2) = 0,  φ2(v1) = 0,  φ2(v2) = 1

Thus    φ1(v1) = φ1(2, 1) = 2a + b = 1 }
        φ1(v2) = φ1(3, 1) = 3a + b = 0 }   or   a = -1, b = 3

        φ2(v1) = φ2(2, 1) = 2c + d = 0 }
        φ2(v2) = φ2(3, 1) = 3c + d = 1 }   or   c = 1, d = -2

Hence the dual basis is {φ1(x, y) = -x + 3y, φ2(x, y) = x - 2y}.
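
A short numerical check (added here as a sketch): if the columns of a matrix P are v1 and v2, then the rows of P⁻¹ are the coefficient vectors of the dual functionals.

    import numpy as np

    P = np.array([[2.0, 3.0],
                  [1.0, 1.0]])     # columns are v1 = (2,1), v2 = (3,1)
    D = np.linalg.inv(P)           # rows are the coefficients of phi1, phi2
    print(D)                       # [[-1.  3.] [ 1. -2.]]  ->  phi1 = -x + 3y, phi2 = x - 2y
    print(D @ P)                   # identity matrix: phi_i(v_j) = delta_ij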

The next theorems give relationships between bases and their duals.

Theorem 11.2: Let {v1, ..., vn} be a basis of V and let {φ1, ..., φn} be the dual basis of V*.
Then for any vector u ∈ V,

    u = φ1(u) v1 + φ2(u) v2 + ... + φn(u) vn

and, for any linear functional σ ∈ V*,

    σ = σ(v1) φ1 + σ(v2) φ2 + ... + σ(vn) φn

Theorem 11.3: Let {v1, ..., vn} and {w1, ..., wn} be bases of V and let {φ1, ..., φn} and
{σ1, ..., σn} be the bases of V* dual to {vi} and {wi} respectively. Suppose
P is the transition matrix from {vi} to {wi}. Then (P⁻¹)ᵗ is the transition
matrix from {φi} to {σi}.




SECOND DUAL SPACE 

We repeat: every vector space V has a dual space F* which consists of all the linear 
functionals on V. Thus V* itself has a dual space V**, called the second dual of V, which 
consists of all the linear functionals on V*. 

We now show that each v GV determines a specific element v GV**. First of all, 
for any <j> GV* we define ^ 

vi<j>) = <l>iv) 

It remains to be shown that this map v.V* -^ K is linear. For any scalars a,b GK and 
any linear functionals ^, o- G V*, we have 

v(a(j> + ba) = (acf> + b(T)(v) — a <i>(v) + b a(v) - av{<j>) + bv{a) 
That is, V is linear and so v GV**. The following theorem applies. 

Theorem 11.4: If V has finite dimension, then the mapping v ^ v is an isomorphism of V 
onto V**. 

The above mapping v i^ t; is called the natural mapping of V into V**. We emphasize 
that this mapping is never onto F** if 7 is not finite-dimensional. However, it is always 
linear and, moreover, it is always one-to-one. 

Now suppose V does have finite dimension. By the above theorem the natural mapping 
determines an isomorphism between V and V**. Unless otherwise stated we shall identify 
V with V** by this mapping. Accordingly we shall view V as the space of linear functionals 
on V* and shall write V = V**. We remark that if {^J is the basis of V* dual to a basis 
{Vi} of V, then {vi} is the basis of V = V** which is dual to (^J. 



ANNIHILATORS

Let W be a subset (not necessarily a subspace) of a vector space V. A linear functional
φ ∈ V* is called an annihilator of W if φ(w) = 0 for every w ∈ W, i.e. if φ(W) = {0}.
We show that the set of all such mappings, denoted by W⁰ and called the annihilator of W,
is a subspace of V*. Clearly 0 ∈ W⁰. Now suppose φ, σ ∈ W⁰. Then, for any scalars
a, b ∈ K and for any w ∈ W,

    (aφ + bσ)(w) = a φ(w) + b σ(w) = a·0 + b·0 = 0
Thus aφ + bσ ∈ W⁰ and so W⁰ is a subspace of V*.

In the case that W is a subspace of V, we have the following relationship between W and
its annihilator W⁰.

Theorem 11.5: Suppose V has finite dimension and W is a subspace of V. Then
(i) dim W + dim W⁰ = dim V and (ii) W⁰⁰ = W.

Here W⁰⁰ = {v ∈ V : φ(v) = 0 for every φ ∈ W⁰} or, equivalently, W⁰⁰ = (W⁰)⁰ where
W⁰⁰ is viewed as a subspace of V under the identification of V and V**.

The concept of an annihilator enables us to give another interpretation of a homogeneous
system of linear equations,

    a11 x1 + a12 x2 + ... + a1n xn = 0
    ..................................     (*)
    am1 x1 + am2 x2 + ... + amn xn = 0




Here each row (ai1, ai2, ..., ain) of the coefficient matrix A = (aij) is viewed as an element
of Kⁿ and each solution vector ξ = (x1, x2, ..., xn) is viewed as an element of the dual space.
In this context, the solution space S of (*) is the annihilator of the rows of A and hence of
the row space of A. Consequently, using Theorem 11.5, we again obtain the following
fundamental result on the dimension of the solution space of a homogeneous system of
linear equations:

    dim S = dim Kⁿ - dim (row space of A) = n - rank (A)
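
The dimension formula can be checked on any small system; here is an added sympy sketch with a coefficient matrix chosen for illustration:

    from sympy import Matrix

    A = Matrix([[1, 1, 1],
                [1, 2, 3]])          # rank 2, n = 3
    S = A.nullspace()                # basis of the solution space (annihilator of the row space)
    print(A.rank(), len(S))          # 2, 1  ->  dim S = n - rank(A) = 3 - 2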

TRANSPOSE OF A LINEAR MAPPING 

Let T :V -^ U be an arbitrary linear mapping from a vector space V into a vector space 
U. Now for any linear functional ^ G U*, the composition ^ o T is a linear mapping from 
V into K: 




That is, (j)oT GV*. Thus the correspondence 

</> h» <f>oT 

is a mapping from U* into V*; we denote it by T' and call it the transpose of T. In other 
words, T*:TJ* -^ V* is defined by 

r'(0) = 4,oT 

Thus {T\4,)){v) = ^{T{v)) for every v &V. 

Theorem 11.6: The transpose mapping T' defined above is linear. 

Proof. For any scalars a,b G K and any linear functionals <^, a G f/*, 

T*{a<j> + ba-) = (a^ + 6tr)or = a{ci>oT) + b(aoT) 

= a T\4,) + h T*{a) 
That is, T' is linear as claimed. 

We emphasize that if T is a linear mapping from V into U, then T* is a linear mapping 
from U* into V*: ^ j,, 

The name "transpose" for the mapping T* no doubt derives from the following theorem. 

Theorem 11.7: Let T-.V-^V be linear, and let A be the matrix representation of T rel- 
ative to bases {Vi} of V and {Ui} of V. Then the transpose matrix A* is 
the matrix representation of T*:U*-* V* relative to the bases dual to 
{Mi} and {Vi}. 




Solved Problems 

DUAL SPACES AND BASES 

11.1. Let φ : R² → R and σ : R² → R be the linear functionals defined by φ(x, y) = x + 2y
and σ(x, y) = 3x - y. Find (i) φ + σ, (ii) 4φ, (iii) 2φ - 5σ.

(i) (φ + σ)(x, y) = φ(x, y) + σ(x, y) = x + 2y + 3x - y = 4x + y

(ii) (4φ)(x, y) = 4 φ(x, y) = 4(x + 2y) = 4x + 8y

(iii) (2φ - 5σ)(x, y) = 2 φ(x, y) - 5 σ(x, y) = 2(x + 2y) - 5(3x - y) = -13x + 9y

11.2. Consider the following basis of R³: {v1 = (1, -1, 3), v2 = (0, 1, -1), v3 = (0, 3, -2)}.
Find the dual basis {φ1, φ2, φ3}.

We seek linear functionals

    φ1(x, y, z) = a1 x + a2 y + a3 z,   φ2(x, y, z) = b1 x + b2 y + b3 z,   φ3(x, y, z) = c1 x + c2 y + c3 z

such that    φ1(v1) = 1    φ1(v2) = 0    φ1(v3) = 0

             φ2(v1) = 0    φ2(v2) = 1    φ2(v3) = 0

             φ3(v1) = 0    φ3(v2) = 0    φ3(v3) = 1

We find φ1 as follows:

    φ1(v1) = φ1(1, -1, 3) = a1 - a2 + 3a3 = 1
    φ1(v2) = φ1(0, 1, -1) = a2 - a3 = 0
    φ1(v3) = φ1(0, 3, -2) = 3a2 - 2a3 = 0

Solving the system of equations, we obtain a1 = 1, a2 = 0, a3 = 0. Thus φ1(x, y, z) = x.

We next find φ2:

    φ2(v1) = φ2(1, -1, 3) = b1 - b2 + 3b3 = 0
    φ2(v2) = φ2(0, 1, -1) = b2 - b3 = 1
    φ2(v3) = φ2(0, 3, -2) = 3b2 - 2b3 = 0

Solving the system, we obtain b1 = 7, b2 = -2, b3 = -3. Hence φ2(x, y, z) = 7x - 2y - 3z.

Finally, we find φ3:

    φ3(v1) = φ3(1, -1, 3) = c1 - c2 + 3c3 = 0
    φ3(v2) = φ3(0, 1, -1) = c2 - c3 = 0
    φ3(v3) = φ3(0, 3, -2) = 3c2 - 2c3 = 1

Solving the system, we obtain c1 = -2, c2 = 1, c3 = 1. Thus φ3(x, y, z) = -2x + y + z.
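
The same dual basis can be read off from a single matrix inverse, as in the sketch after Example 11.5 (an added illustration):

    from sympy import Matrix

    P = Matrix([[1, 0, 0],
                [-1, 1, 3],
                [3, -1, -2]])        # columns are v1, v2, v3
    print(P.inv())
    # Matrix([[1, 0, 0], [7, -2, -3], [-2, 1, 1]])
    #  -> phi1 = x,  phi2 = 7x - 2y - 3z,  phi3 = -2x + y + z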



11.3. Let V be the vector space of polynomials over R of degree ≤ 1, i.e. V =
{a + bt : a, b ∈ R}. Let φ1 : V → R and φ2 : V → R be defined by

    φ1(f(t)) = ∫₀¹ f(t) dt   and   φ2(f(t)) = ∫₀² f(t) dt

(We remark that φ1 and φ2 are linear and so belong to the dual space V*.) Find the
basis {v1, v2} of V which is dual to {φ1, φ2}.

Let v1 = a + bt and v2 = c + dt. By definition of the dual basis,

    φ1(v1) = 1,  φ2(v1) = 0   and   φ1(v2) = 0,  φ2(v2) = 1
Thus

    φ1(v1) = ∫₀¹ (a + bt) dt = a + b/2 = 1 }
    φ2(v1) = ∫₀² (a + bt) dt = 2a + 2b = 0 }   or   a = 2, b = -2

    φ1(v2) = ∫₀¹ (c + dt) dt = c + d/2 = 0 }
    φ2(v2) = ∫₀² (c + dt) dt = 2c + 2d = 1 }   or   c = -1/2, d = 1

In other words, {2 - 2t, -1/2 + t} is the basis of V which is dual to {φ1, φ2}.

11.4. Prove Theorem 11.1: Suppose {vi, . . .,Vn} isa basis of V over K. Let <l>^, . . . , <j>^ & V* 
be the linear functionals defined by 

fl if i = i 
S.(v.) = 8.. ^ i 

Then {^j, . . .,^„} is a basis of V*. 

We first show that {^i, . . .,<i6„} spans V*. Let be an arbitrary element of V*, and suppose 

Set <r — ki<f>i + • • ■ + fc„0„. Then 

a(Vi) = (fci0i + • • • + fe„0„)(lJi) 

= fei 01(1^1) + k2<p2{Vi) + ••• + fe„0„(Vl) 

= fci • 1 + ^2 • + • • • + &„ • = fci 
Similarly, for i = 2, . . . , w, 

<T(Di) = (fci0i + • ■ ■ + /i:„0„)(i;i) 

= A;i0i('!;i) + ••• + ki^iivi) + •■• + fe„0n(-i'i) = ^i 

Thus <p{Vi) = a{Vi) for i = 1, . ..,n. Since and a agree on the basis vectors, = cr = 
&101 + • • • + fe„0„. Accordingly, {0i 0„} spans V*. 

It remains to be shown that {0i, . . . , J is linearly independent. Suppose 

ai0i + a202 + • • • + a„0„ = 

Applying both sides to v^, we obtain 

= 0(vj) = {aj0i + • ■ • + a„0n)(i'i) 

= ai0i(l'i) + ttz 02(^1) + ■•• + O'nSi'nC'yi) 

= »! • 1 + a2 • + ■ • • + a„ • = tti 
Similarly, for i = 2, . . . , «, 

= ©(-Ui) = (ai0i ^ + a„^„)(Vi) 

= a-l 0l('yt) + • • • + »i S6i(^i) + • • • + «n S^nC^'i) = "i 

That is, «! = 0, . . .,a.„ = 0. Hence {0i, ...,</>n} is linearly independent and so it is a basis of V*. 

11.5. Prove Theorem 11.2: Let {vi, . . . , v„} be a basis of V and let {<j>^, .. .,<f,Jhe the dual 
basis of V*. Then, for any vector uGV, 

and, for any linear functional a GV*, 

a = cr(i;>j + .7(1;,)^, + • • • + <7(i;J<^„ (2) 

Suppose M — aiVi + a^Vi + • • • + «„-«„ (5) 

Then 

0i(m) = di 0i(i;i) + ^2 01(^2) + • ■ • + «« SilC'^n) = «! • 1 + O2 • + • • • + «■« • == «1 




Similarly, for i = 2, . . .,n, 

</>i{u) = »! ,j,i(vi) + •■■ + Ui <f>i{Vi) + •■■ + a„ 0i(i;„) = »; 

That is, ipiiu) = Oj, 02(w) = a2' • • •. 0nW = «n- Substituting these results into (3), we obtain (1). 
Next we prove (2). Applying the linear functional a to both sides of (1), 
a{u) = 0i(M)<r(vi) + ^aM^K) + ••• + </>n(u) <r(v„) 
= o{Vi) ^i(w) + o(V2) 02(m) + • • • + a{vj <f,„{u) 

= {<'{Vi)'f>l + ('(^2)02 + • • • + <r(t)„)0„)(M) 

Since the above holds for every m G V, a — a{vi)<f,i + <r(i^2)02 + • • • + "{vj^^ as claimed. 

11.6. Prove Theorem 11.3: Let {vu...,Vn} and {wi,...,Wn} be bases of V and let 
{^1, • . . , ^„} and (<7j, . . . , CT„} be the bases of V* dual to {vi} and {Wt} respectively. 
Suppose P is the transition matrix from {Vi} to {Wi}. Then (P~»)' is the transition 
matrix from {(j>J to {<tJ. 

Suppose 

Wi = OuUi + ai2V2 + • • ■ + ai„i;„ <ri = ftn^i + 6i202 + • ■ ■ + 6i„0„ 

W2 = (121^1 + a22'«'2 + ■ ■ • + a2nVn 02 = 62101 + ^2202 + " " " + hn'f'n 



w„ = a„ii)i + a„2i'2 + • ■ • + a„„v„ a„ - b^i,pi + 6„202 + ' • • + 6„n0„ 

where P = («„) and Q = (6y). We seek to prove that Q = (P-i)«. 

Let Ri denote the tth row of Q and let Cj denote the ith column of P«. Then 
■Rj = (6ti. 6i2. • • • . 6i„) and Cj = (dji, aj2, . . . , a^^Y 
By definition of the dual basis, 

<fi(Wj) = (6(101 + 6i202 + • • • + 6j„^„)(ajii)i + aj2V2 + ■ • • + aj„v„) 
= 6jiaji + 6j2aj2 + ■ • • + 6i„a.j„ = Rfij = Sjj 
where Sy is the Kronecker delta. Thus 

/KiCi iJ,C2 ... K„C„\ 

QPt = ^2^1 R2C2 • • ■ R2C„ - !"■'■ •■■"1=7 

\RnCl Rvp2 ■ ■ ■ ^rflnl 

and hence Q = (P«)-i = (P-i)« as claimed. 



11.7. Suppose V has finite dimension. Show that if v ∈ V, v ≠ 0, then there exists φ ∈ V* such that φ(v) ≠ 0.

We extend {v} to a basis {v, v2, ..., vn} of V. By Theorem 6.1, there exists a unique linear mapping φ : V → K such that φ(v) = 1 and φ(vi) = 0, i = 2, ..., n. Hence φ has the desired property.



11.8. Prove Theorem 11.4: If V has finite dimension, then the mapping v ↦ v̂ is an isomorphism of V onto V**. (Here v̂ : V* → K is defined by v̂(φ) = φ(v).)

We first prove that the map v ↦ v̂ is linear, i.e. for any vectors v, w ∈ V and any scalars a, b ∈ K, (av + bw)^ = av̂ + bŵ. For any linear functional φ ∈ V*,
          (av + bw)^(φ) = φ(av + bw) = a φ(v) + b φ(w)
                        = a v̂(φ) + b ŵ(φ) = (av̂ + bŵ)(φ)
Since (av + bw)^(φ) = (av̂ + bŵ)(φ) for every φ ∈ V*, we have (av + bw)^ = av̂ + bŵ. Thus the map v ↦ v̂ is linear.



256 LINEAR FUNCTIONALS AND THE DUAL SPACE [CHAP. 11 

Now suppose v ∈ V, v ≠ 0. Then, by the preceding problem, there exists φ ∈ V* for which φ(v) ≠ 0. Hence v̂(φ) = φ(v) ≠ 0 and thus v̂ ≠ 0. Since v ≠ 0 implies v̂ ≠ 0, the map v ↦ v̂ is nonsingular and hence an isomorphism (Theorem 6.5).

Now dim V = dim V* = dim V** because V has finite dimension. Accordingly, the mapping v ↦ v̂ is an isomorphism of V onto V**.

ANNIHILATORS 

11.9. Show that if φ ∈ V* annihilates a subset S of V, then φ annihilates the linear span L(S) of S. Hence S⁰ = (L(S))⁰.

Suppose v ∈ L(S). Then there exist w1, ..., wr ∈ S for which v = a1w1 + a2w2 + ··· + arwr. Then
          φ(v) = a1 φ(w1) + a2 φ(w2) + ··· + ar φ(wr) = a1·0 + a2·0 + ··· + ar·0 = 0
Since v was an arbitrary element of L(S), φ annihilates L(S) as claimed.

11.10. Let W be the subspace of R⁴ spanned by v1 = (1, 2, -3, 4) and v2 = (0, 1, 4, -1). Find a basis of the annihilator of W.

By the preceding problem, it suffices to find a basis of the set of linear functionals φ(x, y, z, w) = ax + by + cz + dw for which φ(v1) = 0 and φ(v2) = 0:
          φ(1, 2, -3, 4) = a + 2b - 3c + 4d = 0
          φ(0, 1, 4, -1) = b + 4c - d = 0
The system of equations in unknowns a, b, c, d is in echelon form with free variables c and d.

Set c = 1, d = 0 to obtain the solution a = 11, b = -4, c = 1, d = 0 and hence the linear functional φ1(x, y, z, w) = 11x - 4y + z.

Set c = 0, d = -1 to obtain the solution a = 6, b = -1, c = 0, d = -1 and hence the linear functional φ2(x, y, z, w) = 6x - y - w.

The set of linear functionals {φ1, φ2} is a basis of W⁰, the annihilator of W.
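The computation above is just the null space of the matrix whose rows span W. A minimal sketch (not from the text) using SymPy, with the vectors of this problem; note that SymPy's basis vectors may differ from the book's by nonzero scalar factors, which gives an equally valid basis of W⁰.

```python
from sympy import Matrix

# Rows span W in R^4 (Problem 11.10).
M = Matrix([[1, 2, -3, 4],
            [0, 1, 4, -1]])

# Coefficient vectors (a, b, c, d) of functionals annihilating W
# are exactly the solutions of M * (a, b, c, d)^T = 0.
for vec in M.nullspace():
    print(vec.T)    # each printed vector gives one basis functional of W^0
```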

11.11. Show that: (i) for any subset S of V, S ⊆ S⁰⁰; (ii) if S1 ⊆ S2, then S2⁰ ⊆ S1⁰.

(i) Let v ∈ S. Then for every linear functional φ ∈ S⁰, v̂(φ) = φ(v) = 0. Hence v̂ ∈ (S⁰)⁰. Therefore, under the identification of V and V**, v ∈ S⁰⁰. Accordingly, S ⊆ S⁰⁰.

(ii) Let φ ∈ S2⁰. Then φ(v) = 0 for every v ∈ S2. But S1 ⊆ S2; hence φ annihilates every element of S1, i.e. φ ∈ S1⁰. Therefore S2⁰ ⊆ S1⁰.

11.12. Prove Theorem 11.5: Suppose V has finite dimension and W is a subspace of V. Then (i) dim W + dim W⁰ = dim V and (ii) W⁰⁰ = W.

(i) Suppose dim V = n and dim W = r ≤ n. We want to show that dim W⁰ = n - r. We choose a basis {w1, ..., wr} of W and extend it to the following basis of V: {w1, ..., wr, v1, ..., v_{n-r}}. Consider the dual basis
          {φ1, ..., φr, σ1, ..., σ_{n-r}}
By definition of the dual basis, each of the above σ's annihilates each wi; hence σ1, ..., σ_{n-r} ∈ W⁰. We claim that {σj} is a basis of W⁰. Now {σj} is part of a basis of V* and so it is linearly independent.

We next show that {σj} spans W⁰. Let σ ∈ W⁰. By Theorem 11.2,
          σ = σ(w1)φ1 + ··· + σ(wr)φr + σ(v1)σ1 + ··· + σ(v_{n-r})σ_{n-r}
            = 0φ1 + ··· + 0φr + σ(v1)σ1 + ··· + σ(v_{n-r})σ_{n-r}
            = σ(v1)σ1 + ··· + σ(v_{n-r})σ_{n-r}
Thus {σ1, ..., σ_{n-r}} spans W⁰ and so it is a basis of W⁰. Accordingly, dim W⁰ = n - r = dim V - dim W as required.

(ii) Suppose dim V = n and dim W = r. Then dim V* = n and, by (i), dim W⁰ = n - r. Thus by (i), dim W⁰⁰ = n - (n - r) = r; therefore dim W = dim W⁰⁰. By the preceding problem, W ⊆ W⁰⁰. Accordingly, W = W⁰⁰.



CHAP. 11] LINEAR FUNCTIONALS AND THE DUAL SPACE 257

11.13. Let U and W be subspaces of V. Prove: (U + W)⁰ = U⁰ ∩ W⁰.

Let φ ∈ (U + W)⁰. Then φ annihilates U + W and so, in particular, φ annihilates U and W. That is, φ ∈ U⁰ and φ ∈ W⁰; hence φ ∈ U⁰ ∩ W⁰. Thus (U + W)⁰ ⊆ U⁰ ∩ W⁰.

On the other hand, suppose σ ∈ U⁰ ∩ W⁰. Then σ annihilates U and also W. If v ∈ U + W, then v = u + w where u ∈ U and w ∈ W. Hence σ(v) = σ(u) + σ(w) = 0 + 0 = 0. Thus σ annihilates U + W, i.e. σ ∈ (U + W)⁰. Accordingly, U⁰ ∩ W⁰ ⊆ (U + W)⁰.

Both inclusion relations give us the desired equality.

Remark: Observe that no dimension argument is employed in the proof; hence the result holds 
for spaces of finite or infinite dimension. 



TRANSPOSE OF A LINEAR MAPPING 

11.14. Let φ be the linear functional on R² defined by φ(x, y) = x - 2y. For each of the following linear operators T on R², find (Tᵗ(φ))(x, y): (i) T(x, y) = (x, 0); (ii) T(x, y) = (y, x + y); (iii) T(x, y) = (2x - 3y, 5x + 2y).

By definition of the transpose mapping, Tᵗ(φ) = φ∘T, i.e. (Tᵗ(φ))(v) = φ(T(v)) for every vector v. Hence

(i) (Tᵗ(φ))(x, y) = φ(T(x, y)) = φ(x, 0) = x

(ii) (Tᵗ(φ))(x, y) = φ(T(x, y)) = φ(y, x + y) = y - 2(x + y) = -2x - y

(iii) (Tᵗ(φ))(x, y) = φ(T(x, y)) = φ(2x - 3y, 5x + 2y) = (2x - 3y) - 2(5x + 2y) = -8x - 7y



11.15. Let T : V → U be linear and let Tᵗ : U* → V* be its transpose. Show that the kernel of Tᵗ is the annihilator of the image of T, i.e. Ker Tᵗ = (Im T)⁰.

Suppose φ ∈ Ker Tᵗ; that is, Tᵗ(φ) = φ∘T = 0. If u ∈ Im T, then u = T(v) for some v ∈ V; hence
          φ(u) = φ(T(v)) = (φ∘T)(v) = 0(v) = 0
We have that φ(u) = 0 for every u ∈ Im T; hence φ ∈ (Im T)⁰. Thus Ker Tᵗ ⊆ (Im T)⁰.

On the other hand, suppose σ ∈ (Im T)⁰; that is, σ(Im T) = {0}. Then, for every v ∈ V,
          (Tᵗ(σ))(v) = (σ∘T)(v) = σ(T(v)) = 0 = 0(v)
We have that (Tᵗ(σ))(v) = 0(v) for every v ∈ V; hence Tᵗ(σ) = 0. Therefore σ ∈ Ker Tᵗ and so (Im T)⁰ ⊆ Ker Tᵗ.

Both inclusion relations give us the required equality.



11.16. Suppose V and U have finite dimension and suppose T : V → U is linear. Prove: rank(T) = rank(Tᵗ).

Suppose dim V = n and dim U = m. Also suppose rank(T) = r. Then, by Theorem 11.5,
          dim((Im T)⁰) = dim U - dim(Im T) = m - rank(T) = m - r
By the preceding problem, Ker Tᵗ = (Im T)⁰. Hence nullity(Tᵗ) = m - r. It then follows that, as claimed,
          rank(Tᵗ) = dim U* - nullity(Tᵗ) = m - (m - r) = r = rank(T)



11.17. Prove Theorem 11.7: Let T : V → U be linear and let A be the matrix representation of T relative to bases {v1, ..., vm} of V and {u1, ..., un} of U. Then the transpose matrix Aᵗ is the matrix representation of Tᵗ : U* → V* relative to the bases dual to {ui} and {vj}.

Suppose
          T(v1) = a11u1 + a12u2 + ··· + a1nun
          T(v2) = a21u1 + a22u2 + ··· + a2nun          (1)
          ........................................
          T(vm) = am1u1 + am2u2 + ··· + amnun

We want to prove that
          Tᵗ(σ1) = a11φ1 + a21φ2 + ··· + am1φm
          Tᵗ(σ2) = a12φ1 + a22φ2 + ··· + am2φm          (2)
          ........................................
          Tᵗ(σn) = a1nφ1 + a2nφ2 + ··· + amnφm

where {σi} and {φj} are the bases dual to {ui} and {vj} respectively.

Let v ∈ V and suppose v = k1v1 + k2v2 + ··· + kmvm. Then, by (1),
          T(v) = k1 T(v1) + k2 T(v2) + ··· + km T(vm)
               = k1(a11u1 + ··· + a1nun) + k2(a21u1 + ··· + a2nun) + ··· + km(am1u1 + ··· + amnun)
               = (k1a11 + k2a21 + ··· + kmam1)u1 + ··· + (k1a1n + k2a2n + ··· + kmamn)un
               = Σ_{i=1}^{n} (k1a1i + k2a2i + ··· + kmami)ui

Hence for j = 1, ..., n,
          (Tᵗ(σj))(v) = σj(T(v)) = σj( Σ_{i=1}^{n} (k1a1i + k2a2i + ··· + kmami)ui )
                      = k1a1j + k2a2j + ··· + kmamj          (3)

On the other hand, for j = 1, ..., n,
          (a1jφ1 + a2jφ2 + ··· + amjφm)(v) = (a1jφ1 + a2jφ2 + ··· + amjφm)(k1v1 + k2v2 + ··· + kmvm)
                      = k1a1j + k2a2j + ··· + kmamj          (4)

Since v ∈ V was arbitrary, (3) and (4) imply that
          Tᵗ(σj) = a1jφ1 + a2jφ2 + ··· + amjφm,    j = 1, ..., n
which is (2). Thus the theorem is proved.

11.18. Let A be an arbitrary mxn matrix over a field K. Prove that the row rank and the 
column rank of A are equal. 

Let T : Kⁿ → Kᵐ be the linear map defined by T(v) = Av, where the elements of Kⁿ and Kᵐ are written as column vectors. Then A is the matrix representation of T relative to the usual bases of Kⁿ and Kᵐ, and the image of T is the column space of A. Hence
          rank(T) = column rank of A
By Theorem 11.7, Aᵗ is the matrix representation of Tᵗ relative to the dual bases. Hence
          rank(Tᵗ) = column rank of Aᵗ = row rank of A
But by Problem 11.16, rank(T) = rank(Tᵗ); hence the row rank and the column rank of A are equal. (This result was stated earlier as Theorem 5.9, page 90, and was proved in a direct way in Problem 5.21.)



Supplementary Problems 

DUAL SPACES AND DUAL BASES 

11.19. Let φ : R³ → R and σ : R³ → R be the linear functionals defined by φ(x, y, z) = 2x - 3y + z and σ(x, y, z) = 4x - 2y + 3z. Find (i) φ + σ, (ii) 3φ, (iii) 2φ - 5σ.

11.20. Let φ be the linear functional on R² defined by φ(2, 1) = 15 and φ(1, -2) = -10. Find φ(x, y) and, in particular, find φ(-2, 7).

11.21. Find the dual basis of each of the following bases of R³:
       (i) {(1, 0, 0), (0, 1, 0), (0, 0, 1)},   (ii) {(1, -2, 3), (1, -1, 1), (2, -4, 7)}.



CHAP. 11] LINEAR FUNCTIONALS AND THE DUAL SPACE 259 

11.22. Let V be the vector space of polynomials over R of degree ≤ 2. Let φ1, φ2 and φ3 be the linear functionals on V defined by
          φ1(f(t)) = ∫₀¹ f(t) dt,   φ2(f(t)) = f′(1),   φ3(f(t)) = f(0)
Here f(t) = a + bt + ct² ∈ V and f′(t) denotes the derivative of f(t). Find the basis {f1(t), f2(t), f3(t)} of V which is dual to {φ1, φ2, φ3}.

11.23. Suppose u, v ∈ V and that φ(u) = 0 implies φ(v) = 0 for all φ ∈ V*. Show that v = ku for some scalar k.

11.24. Suppose φ, σ ∈ V* and that φ(v) = 0 implies σ(v) = 0 for all v ∈ V. Show that σ = kφ for some scalar k.

11.25. Let V be the vector space of polynomials over K. For a ∈ K, define φa : V → K by φa(f(t)) = f(a). Show that: (i) φa is linear; (ii) if a ≠ b, then φa ≠ φb.

11.26. Let V be the vector space of polynomials of degree ≤ 2. Let a, b, c ∈ K be distinct scalars. Let φa, φb and φc be the linear functionals defined by φa(f(t)) = f(a), φb(f(t)) = f(b), φc(f(t)) = f(c). Show that {φa, φb, φc} is linearly independent, and find the basis {f1(t), f2(t), f3(t)} of V which is its dual.

11.27. Let V be the vector space of square matrices of order n. Let T : V → K be the trace mapping: T(A) = a11 + a22 + ··· + ann, where A = (aij). Show that T is linear.

11.28. Let W be a subspace of V. For any linear functional φ on W, show that there is a linear functional σ on V such that σ(w) = φ(w) for any w ∈ W, i.e. φ is the restriction of σ to W.

11.29. Let {e1, ..., en} be the usual basis of Kⁿ. Show that the dual basis is {π1, ..., πn} where πi is the ith projection mapping: πi(a1, ..., an) = ai.

11.30. Let V be a vector space over R. Let φ1, φ2 ∈ V* and suppose σ : V → R defined by σ(v) = φ1(v) φ2(v) also belongs to V*. Show that either φ1 = 0 or φ2 = 0.

ANNIHILATORS 

11.31. Let W be the subspace of R⁴ spanned by (1, 2, -3, 4), (1, 3, -2, 6) and (1, 4, -1, 8). Find a basis of the annihilator of W.

11.32. Let W be the subspace of R³ spanned by (1, 1, 0) and (0, 1, 1). Find a basis of the annihilator of W.

11.33. Show that, for any subset S of V, L(S) = S⁰⁰ where L(S) is the linear span of S.

11.34. Let U and W be subspaces of a vector space V of finite dimension. Prove: (U ∩ W)⁰ = U⁰ + W⁰.

11.35. Suppose V = U ⊕ W. Prove that V* = U⁰ ⊕ W⁰.

TRANSPOSE OF A LINEAR MAPPING

11.36. Let φ be the linear functional on R² defined by φ(x, y) = 3x - 2y. For each linear mapping T : R³ → R², find (Tᵗ(φ))(x, y, z):
       (i) T(x, y, z) = (x + y, y + z);   (ii) T(x, y, z) = (x + y + z, 2x - y).

11.37. Suppose S : U → V and T : V → W are linear. Prove that (T∘S)ᵗ = Sᵗ∘Tᵗ.

11.38. Suppose T : V → U is linear and V has finite dimension. Prove that Im Tᵗ = (Ker T)⁰.

11.39. Suppose T : V → U is linear and u ∈ U. Prove that u ∈ Im T or there exists φ ∈ U* such that Tᵗ(φ) = 0 and φ(u) = 1.

11.40. Let V be of finite dimension. Show that the mapping T ↦ Tᵗ is an isomorphism from Hom(V, V) onto Hom(V*, V*). (Here T is any linear operator on V.)



260 LINEAR FUNCTIONALS AND THE DUAL SPACE [CHAP. 11

MISCELLANEOUS PROBLEMS 

11.41. Let V be a vector space over R. The line segment uv joining points u, v ∈ V is defined by uv = {tu + (1 - t)v : 0 ≤ t ≤ 1}. A subset S of V is termed convex if u, v ∈ S implies uv ⊆ S. Let φ ∈ V* and let
          W⁺ = {v ∈ V : φ(v) > 0},   W = {v ∈ V : φ(v) = 0},   W⁻ = {v ∈ V : φ(v) < 0}
Prove that W⁺, W and W⁻ are convex.

11.42. Let V be a vector space of finite dimension. A hyperplane H of V is defined to be the kernel of a nonzero linear functional φ on V. Show that every subspace of V is the intersection of a finite number of hyperplanes.



Answers to Supplementary Problems 

11.19. (i) 6x - 5y + 4z,  (ii) 6x - 9y + 3z,  (iii) -16x + 4y - 13z

11.20. φ(x, y) = 4x + 7y,  φ(-2, 7) = 41

11.21. (i) {φ1(x, y, z) = x,  φ2(x, y, z) = y,  φ3(x, y, z) = z}
       (ii) {φ1(x, y, z) = -3x - 5y - 2z,  φ2(x, y, z) = 2x + y,  φ3(x, y, z) = x + 2y + z}

11.25. (ii) Let f(t) = t. Then φa(f(t)) = a ≠ b = φb(f(t)), and therefore φa ≠ φb.

11.26. { f1(t) = (t - b)(t - c)/((a - b)(a - c)),  f2(t) = (t - a)(t - c)/((b - a)(b - c)),  f3(t) = (t - a)(t - b)/((c - a)(c - b)) }

11.31. {φ1(x, y, z, t) = 5x - y + z,  φ2(x, y, z, t) = 2y - t}

11.32. {φ(x, y, z) = x - y + z}

11.36. (i) (Tᵗ(φ))(x, y, z) = 3x + y - 2z,  (ii) (Tᵗ(φ))(x, y, z) = -x + 5y + 3z



chapter 12 



Bilinear, Quadratic and Hermitian Forms 

BILINEAR FORMS 

Let V be a vector space of finite dimension over a field K. A bilinear form on V is a mapping f : V × V → K which satisfies

(i) f(au1 + bu2, v) = af(u1, v) + bf(u2, v)

(ii) f(u, av1 + bv2) = af(u, v1) + bf(u, v2)

for all a, b ∈ K and all ui, vi ∈ V. We express condition (i) by saying f is linear in the first variable, and condition (ii) by saying f is linear in the second variable.



Example 12.1: Let φ and σ be arbitrary linear functionals on V. Let f : V × V → K be defined by f(u, v) = φ(u)σ(v). Then f is bilinear because φ and σ are each linear. (Such a bilinear form f turns out to be the "tensor product" of φ and σ and so is sometimes written f = φ ⊗ σ.)

Example 12.2: Let f be the dot product on Rⁿ; that is,
          f(u, v) = u·v = a1b1 + a2b2 + ··· + anbn
where u = (ai) and v = (bi). Then f is a bilinear form on Rⁿ.

Example 12.3: Let A = (aij) be any n × n matrix over K. Then A may be viewed as a bilinear form f on Kⁿ by defining
          f(X, Y) = XᵗAY = (x1, x2, ..., xn) ( a11 a12 ... a1n ; a21 a22 ... a2n ; ...... ; an1 an2 ... ann ) (y1 ; y2 ; ... ; yn)
                  = a11x1y1 + a12x1y2 + ··· + annxnyn = Σ_{i,j} aij xi yj
The above formal expression in the variables xi, yi is termed the bilinear polynomial corresponding to the matrix A. Formula (1) below shows that, in a certain sense, every bilinear form is of this type.
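Evaluating the bilinear polynomial of a matrix is a one-line computation. The sketch below (illustrative, not from the text; the matrix and vectors are arbitrary choices) checks that the matrix product XᵗAY agrees with the double sum Σ aij xi yj.

```python
import numpy as np

def bilinear(A, X, Y):
    """Evaluate f(X, Y) = X^t A Y for column vectors X, Y."""
    return float(X.T @ A @ Y)

A = np.array([[1., 2.], [0., -3.]])     # arbitrary 2x2 matrix over R
X = np.array([[1.], [4.]])
Y = np.array([[2.], [5.]])

# Same value as the bilinear polynomial sum_{i,j} a_ij x_i y_j
poly = sum(A[i, j] * X[i, 0] * Y[j, 0] for i in range(2) for j in range(2))
print(bilinear(A, X, Y), poly)          # both print -48.0
```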

We will let B(V) denote the set of bilinear forms on V. A vector space structure is placed on B(V) by defining f + g and kf by:
          (f + g)(u, v) = f(u, v) + g(u, v)
          (kf)(u, v) = kf(u, v)
for any f, g ∈ B(V) and any k ∈ K. In fact,

Theorem 12.1: Let V be a vector space of dimension n over K. Let {φ1, ..., φn} be a basis of the dual space V*. Then {fij : i, j = 1, ..., n} is a basis of B(V) where fij is defined by fij(u, v) = φi(u)φj(v). Thus, in particular, dim B(V) = n².



261 



262 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12 

BILINEAR FORMS AND MATRICES 

Let f be a bilinear form on V, and let {e1, ..., en} be a basis of V. Suppose u, v ∈ V and suppose
          u = a1e1 + ··· + anen,   v = b1e1 + ··· + bnen
Then
          f(u, v) = f(a1e1 + ··· + anen, b1e1 + ··· + bnen)
                  = a1b1 f(e1, e1) + a1b2 f(e1, e2) + ··· + anbn f(en, en) = Σ_{i,j=1}^{n} aibj f(ei, ej)

Thus f is completely determined by the n² values f(ei, ej).

The matrix A = (aij) where aij = f(ei, ej) is called the matrix representation of f relative to the basis {ei} or, simply, the matrix of f in {ei}. It "represents" f in the sense that
          f(u, v) = Σ aibj f(ei, ej) = (a1, ..., an) A (b1 ; ... ; bn) = [u]eᵗ A [v]e          (1)
for all u, v ∈ V. (As usual, [u]e denotes the coordinate (column) vector of u ∈ V in the basis {ei}.)

We next ask, how does a matrix representing a bilinear form transform when a new basis is selected? The answer is given in the following theorem. (Recall Theorem 7.4 that the transition matrix P from one basis {ei} to another {e′i} has the property that [u]e = P[u]e′ for every u ∈ V.)

Theorem 12.2: Let P be the transition matrix from one basis to another. If A is the matrix of f in the original basis, then
          B = PᵗAP
is the matrix of f in the new basis.

The above theorem motivates the following definition.

Definition: A matrix B is said to be congruent to a matrix A if there exists an invertible (or: nonsingular) matrix P such that B = PᵗAP.

Thus by the above theorem matrices representing the same bilinear form are congruent. We remark that congruent matrices have the same rank because P and Pᵗ are nonsingular; hence the following definition is well defined.

Definition: The rank of a bilinear form f on V, written rank(f), is defined to be the rank of any matrix representation. We say that f is degenerate or nondegenerate according as to whether rank(f) < dim V or rank(f) = dim V.



ALTERNATING BILINEAR FORMS 

A bilinear form f on V is said to be alternating if

(i) f(v, v) = 0

for every v ∈ V. If f is alternating, then
          0 = f(u + v, u + v) = f(u, u) + f(u, v) + f(v, u) + f(v, v)

and so (ii) f(u, v) = -f(v, u)

for every u, v ∈ V. A bilinear form which satisfies condition (ii) is said to be skew symmetric (or: anti-symmetric). If 1 + 1 ≠ 0 in K, then condition (ii) implies f(v, v) = -f(v, v) which implies condition (i). In other words, alternating and skew symmetric are equivalent when 1 + 1 ≠ 0.

The main structure theorem of alternating bilinear forms follows. 

Theorem 12.3: Let f be an alternating bilinear form on V. Then there exists a basis of V in which f is represented by a block diagonal matrix of the form
          diag( ( 0 1 ; -1 0 ),  ( 0 1 ; -1 0 ),  ...,  ( 0 1 ; -1 0 ),  0,  ...,  0 )
that is, with diagonal blocks ( 0 1 ; -1 0 ) followed by zeros. Moreover, the number of blocks ( 0 1 ; -1 0 ) is uniquely determined by f (because it is equal to ½ rank(f)).

In particular, the above theorem shows that an alternating bilinear form must have even rank.



SYMMETRIC BILINEAR FORMS, QUADRATIC FORMS 

A bilinear form f on V is said to be symmetric if
          f(u, v) = f(v, u)
for every u, v ∈ V. If A is a matrix representation of f, we can write
          f(X, Y) = XᵗAY = (XᵗAY)ᵗ = YᵗAᵗX
(We use the fact that XᵗAY is a scalar and therefore equals its transpose.) Thus if f is symmetric,
          YᵗAᵗX = f(X, Y) = f(Y, X) = YᵗAX
and since this is true for all vectors X, Y it follows that A = Aᵗ or A is symmetric. Conversely if A is symmetric, then f is symmetric.

The main result for symmetric bilinear forms is given in

Theorem 12.4: Let f be a symmetric bilinear form on V over K (in which 1 + 1 ≠ 0). Then V has a basis {v1, ..., vn} in which f is represented by a diagonal matrix, i.e. f(vi, vj) = 0 for i ≠ j.

Alternate Form of Theorem 12.4: Let A be a symmetric matrix over K (in which 1 + 1 ≠ 0). Then there exists an invertible (or: nonsingular) matrix P such that PᵗAP is diagonal. That is, A is congruent to a diagonal matrix.



264 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12



Since an invertible matrix P is a product of elementary matrices (Problem 3.36), one way of obtaining the diagonal form PᵗAP is by a sequence of elementary row operations and the same sequence of elementary column operations. These same elementary row operations on I will yield Pᵗ. This method is illustrated in the next example.



Example 12.4: Let A = ( 1 2 -3 ; 2 5 -4 ; -3 -4 8 ), a symmetric matrix. It is convenient to form the block matrix (A, I):

          (A, I) = ( 1 2 -3 | 1 0 0 ; 2 5 -4 | 0 1 0 ; -3 -4 8 | 0 0 1 )

We apply the operations R2 → -2R1 + R2 and R3 → 3R1 + R3 to (A, I), and then the corresponding operations C2 → -2C1 + C2 and C3 → 3C1 + C3 to A to obtain

          ( 1 2 -3 | 1 0 0 ; 0 1 2 | -2 1 0 ; 0 2 -1 | 3 0 1 )   and then   ( 1 0 0 | 1 0 0 ; 0 1 2 | -2 1 0 ; 0 2 -1 | 3 0 1 )

We next apply the operation R3 → -2R2 + R3 and then the corresponding operation C3 → -2C2 + C3 to obtain

          ( 1 0 0 | 1 0 0 ; 0 1 2 | -2 1 0 ; 0 0 -5 | 7 -2 1 )   and then   ( 1 0 0 | 1 0 0 ; 0 1 0 | -2 1 0 ; 0 0 -5 | 7 -2 1 )

Now A has been diagonalized. We set

          P = ( 1 -2 7 ; 0 1 -2 ; 0 0 1 )   and then   PᵗAP = ( 1 0 0 ; 0 1 0 ; 0 0 -5 )

Definition: A mapping q : V → K is called a quadratic form if q(v) = f(v, v) for some symmetric bilinear form f on V.

We call q the quadratic form associated with the symmetric bilinear form f. If 1 + 1 ≠ 0 in K, then f is obtainable from q according to the identity
          f(u, v) = ½(q(u + v) - q(u) - q(v))
The above formula is called the polar form of f.

Now if f is represented by a symmetric matrix A = (aij), then q is represented in the form
          q(X) = XᵗAX = (x1, ..., xn) ( a11 a12 ... a1n ; a21 a22 ... a2n ; ...... ; an1 an2 ... ann ) (x1 ; x2 ; ... ; xn)
               = Σ_{i,j} aij xi xj = a11x1² + a22x2² + ··· + annxn² + 2 Σ_{i<j} aij xi xj

The above formal expression in the variables xi is termed the quadratic polynomial corresponding to the symmetric matrix A. Observe that if the matrix A is diagonal, then q has the diagonal representation
          q(X) = XᵗAX = a11x1² + a22x2² + ··· + annxn²
that is, the quadratic polynomial representing q will contain no "cross product" terms. By Theorem 12.4, every quadratic form has such a representation (when 1 + 1 ≠ 0).



CHAP. 12] BILINEAR, QUADRATIC AND HERMITIAN FORMS 265 

Example 12.5: Consider the following quadratic form on R²:
          q(x, y) = 2x² - 12xy + 5y²
One way of diagonalizing q is by the method known as "completing the square" which is fully described in Problem 12.35. In this case, we make the substitution x = s + 3t, y = t to obtain the diagonal form
          q(x, y) = 2(s + 3t)² - 12(s + 3t)t + 5t² = 2s² - 13t²

REAL SYMMETRIC BILINEAR FORMS. LAW OF INERTIA 

In this section we treat symmetric bilinear forms and quadratic forms on vector 
spaces over the real field R. These forms appear in many branches of mathematics and 
physics. The special nature of R permits an independent theory. The main result follows. 

Theorem 12.5: Let / be a symmetric bilinear form on V over R. Then there is a basis of 
V in which / is represented by a diagonal matrix; every other diagonal 
representation has the same number P of positive entries and the same 
number N of negative entries. The difference S = P — N is called the 
signature of /. 

A real symmetric bilinear form f is said to be nonnegative semidefinite if
          q(v) = f(v, v) ≥ 0
for every vector v; and is said to be positive definite if
          q(v) = f(v, v) > 0
for every vector v ≠ 0. By the above theorem,

(i) f is nonnegative semidefinite if and only if S = rank(f)
(ii) f is positive definite if and only if S = dim V

where S is the signature of f.

Example 12.6: Let f be the dot product on Rⁿ; that is,
          f(u, v) = u·v = a1b1 + a2b2 + ··· + anbn
where u = (ai) and v = (bi). Note that f is symmetric since
          f(u, v) = u·v = v·u = f(v, u)
Furthermore, f is positive definite because
          f(u, u) = a1² + a2² + ··· + an² > 0
when u ≠ 0.

In the next chapter we will see how a real quadratic form q transforms when the transition matrix P is "orthogonal". If no condition is placed on P, then q can be represented in diagonal form with only 1's and -1's as nonzero coefficients. Specifically,

Corollary 12.6: Any real quadratic form q has a unique representation in the form
          q(x1, ..., xn) = x1² + ··· + xs² - x_{s+1}² - ··· - xr²
where r = rank(q).

The above result for real quadratic forms is sometimes referred to as the Law of Inertia or Sylvester's Theorem.



266 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12 

HERMITIAN FORMS 

Let V be a vector space of finite dimension over the complex field C. Let f : V × V → C be such that

(i) f(au1 + bu2, v) = af(u1, v) + bf(u2, v)

(ii) f(u, v) = \overline{f(v, u)}

where a, b ∈ C and ui, v ∈ V. Then f is called a Hermitian form on V. (As usual, k̄ denotes the complex conjugate of k ∈ C.) By (i) and (ii),
          f(u, av1 + bv2) = \overline{f(av1 + bv2, u)} = \overline{a f(v1, u) + b f(v2, u)}
                          = ā \overline{f(v1, u)} + b̄ \overline{f(v2, u)} = ā f(u, v1) + b̄ f(u, v2)

That is, (iii) f(u, av1 + bv2) = ā f(u, v1) + b̄ f(u, v2)

As before, we express condition (i) by saying f is linear in the first variable. On the other hand, we express condition (iii) by saying f is conjugate linear in the second variable. Note that, by (ii), f(v, v) = \overline{f(v, v)} and so f(v, v) is real for every v ∈ V.

Example 12.7: Let A = (aij) be an n × n matrix over C. We write Ā for the matrix obtained by taking the complex conjugate of every entry of A, that is, Ā = (āij). We also write A* for Āᵗ = \overline{Aᵗ}. The matrix A is said to be Hermitian if A* = A, i.e. if aij = āji. If A is Hermitian, then f(X, Y) = XᵗAȲ defines a Hermitian form on Cⁿ (Problem 12.16).
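The Hermitian form of Example 12.7 is a one-line computation. A minimal sketch (illustrative; the particular Hermitian matrix and vectors are arbitrary choices, not from the text) checking the defining property f(X, Y) = conjugate of f(Y, X), and that f(X, X) is real:

```python
import numpy as np

def herm_form(A, X, Y):
    """f(X, Y) = X^t A Ybar for a Hermitian matrix A."""
    return (X.T @ A @ np.conj(Y)).item()

A = np.array([[2, 1 - 1j], [1 + 1j, 3]])          # A* = A (Hermitian)
X = np.array([[1 + 2j], [3j]])
Y = np.array([[2 - 1j], [1 + 1j]])

print(np.isclose(herm_form(A, X, Y), np.conj(herm_form(A, Y, X))))  # True
print(np.round(herm_form(A, X, X), 12))                             # a real value (43+0j)
```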

The mapping q : V → R defined by q(v) = f(v, v) is called the Hermitian quadratic form or complex quadratic form associated with the Hermitian form f. We can obtain f from q according to the following identity called the polar form of f:
          f(u, v) = ¼(q(u + v) - q(u - v)) + (i/4)(q(u + iv) - q(u - iv))

Now suppose {e1, ..., en} is a basis of V. The matrix H = (hij) where hij = f(ei, ej) is called the matrix representation of f in the basis {ei}. By (ii), f(ei, ej) = \overline{f(ej, ei)}; hence H is Hermitian and, in particular, the diagonal entries of H are real. Thus any diagonal representation of f contains only real entries. The next theorem is the complex analog of Theorem 12.5 on real symmetric bilinear forms.

Theorem 12.7: Let f be a Hermitian form on V. Then there exists a basis {e1, ..., en} of V in which f is represented by a diagonal matrix, i.e. f(ei, ej) = 0 for i ≠ j. Moreover, every diagonal representation of f has the same number P of positive entries, and the same number N of negative entries. The difference S = P - N is called the signature of f.

Analogously, a Hermitian form f is said to be nonnegative semidefinite if
          q(v) = f(v, v) ≥ 0
for every v ∈ V, and is said to be positive definite if
          q(v) = f(v, v) > 0
for every v ≠ 0.

Example 12.8: Let f be the dot product on Cⁿ; that is,
          f(u, v) = u·v = z1w̄1 + z2w̄2 + ··· + znw̄n
where u = (zi) and v = (wi). Then f is a Hermitian form on Cⁿ. Moreover, f is positive definite since, for any u ≠ 0,
          f(u, u) = z1z̄1 + z2z̄2 + ··· + znz̄n = |z1|² + |z2|² + ··· + |zn|² > 0



CHAP. 12] BILINEAR, QUADRATIC AND HERMITIAN FORMS 267 

Solved Problems 

BILINEAR FORMS 

12.1. Let u = (x1, x2, x3) and v = (y1, y2, y3), and let
          f(u, v) = 3x1y1 - 2x1y2 + 5x2y1 + 7x2y2 - 8x2y3 + 4x3y2 - x3y3
Express f in matrix notation.

Let A be the 3 × 3 matrix whose ij-entry is the coefficient of xiyj. Then
          f(u, v) = XᵗAY = (x1, x2, x3) ( 3 -2 0 ; 5 7 -8 ; 0 4 -1 ) (y1 ; y2 ; y3)

12.2. Let A be an n × n matrix over K. Show that the following mapping f is a bilinear form on Kⁿ: f(X, Y) = XᵗAY.

For any a, b ∈ K and any Xi, Yi ∈ Kⁿ,
          f(aX1 + bX2, Y) = (aX1 + bX2)ᵗAY = (aX1ᵗ + bX2ᵗ)AY
                          = aX1ᵗAY + bX2ᵗAY = af(X1, Y) + bf(X2, Y)
Hence f is linear in the first variable. Also,
          f(X, aY1 + bY2) = XᵗA(aY1 + bY2) = aXᵗAY1 + bXᵗAY2 = af(X, Y1) + bf(X, Y2)
Hence f is linear in the second variable, and so f is a bilinear form on Kⁿ.

12.3. Let f be the bilinear form on R² defined by
          f((x1, x2), (y1, y2)) = 2x1y1 - 3x1y2 + x2y2

(i) Find the matrix A of f in the basis {u1 = (1, 0), u2 = (1, 1)}.
(ii) Find the matrix B of f in the basis {v1 = (2, 1), v2 = (1, -1)}.
(iii) Find the transition matrix P from the basis {ui} to the basis {vi}, and verify that B = PᵗAP.

(i) Set A = (aij) where aij = f(ui, uj):
          a11 = f(u1, u1) = f((1, 0), (1, 0)) = 2 - 0 + 0 = 2
          a12 = f(u1, u2) = f((1, 0), (1, 1)) = 2 - 3 + 0 = -1
          a21 = f(u2, u1) = f((1, 1), (1, 0)) = 2 - 0 + 0 = 2
          a22 = f(u2, u2) = f((1, 1), (1, 1)) = 2 - 3 + 1 = 0
Thus A = ( 2 -1 ; 2 0 ) is the matrix of f in the basis {u1, u2}.

(ii) Set B = (bij) where bij = f(vi, vj):
          b11 = f(v1, v1) = f((2, 1), (2, 1)) = 8 - 6 + 1 = 3
          b12 = f(v1, v2) = f((2, 1), (1, -1)) = 4 + 6 - 1 = 9
          b21 = f(v2, v1) = f((1, -1), (2, 1)) = 4 - 3 - 1 = 0
          b22 = f(v2, v2) = f((1, -1), (1, -1)) = 2 + 3 + 1 = 6
Thus B = ( 3 9 ; 0 6 ) is the matrix of f in the basis {v1, v2}.

(iii) We must write v1 and v2 in terms of the ui:
          v1 = (2, 1) = (1, 0) + (1, 1) = u1 + u2
          v2 = (1, -1) = 2(1, 0) - (1, 1) = 2u1 - u2

Then P = ( 1 2 ; 1 -1 ) and so Pᵗ = ( 1 1 ; 2 -1 ). Thus
          PᵗAP = ( 1 1 ; 2 -1 )( 2 -1 ; 2 0 )( 1 2 ; 1 -1 ) = ( 3 9 ; 0 6 ) = B
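The change-of-basis computation in Problem 12.3 can be reproduced numerically; a minimal sketch (illustrative, not part of the text) that rebuilds A, B and P from the problem's data and checks B = PᵗAP:

```python
import numpy as np

def f(u, v):                      # the bilinear form of Problem 12.3
    return 2*u[0]*v[0] - 3*u[0]*v[1] + u[1]*v[1]

u_basis = [np.array([1, 0]), np.array([1, 1])]
v_basis = [np.array([2, 1]), np.array([1, -1])]

A = np.array([[f(a, b) for b in u_basis] for a in u_basis])   # matrix of f in {u_i}
B = np.array([[f(a, b) for b in v_basis] for a in v_basis])   # matrix of f in {v_i}
P = np.array([[1, 2],                                         # columns: v_i in terms of u_i
              [1, -1]])

print(A, B, P.T @ A @ P, sep="\n")   # the last two agree: B = P^t A P
```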



12.4. Prove Theorem 12.1: Let V be a vector space of dimension n over K. Let {φ1, ..., φn} be a basis of the dual space V*. Then {fij : i, j = 1, ..., n} is a basis of B(V) where fij is defined by fij(u, v) = φi(u)φj(v). Thus, in particular, dim B(V) = n².

Let {e1, ..., en} be the basis of V dual to {φi}. We first show that {fij} spans B(V). Let f ∈ B(V) and suppose f(ei, ej) = aij. We claim that f = Σ aij fij. It suffices to show that f(es, et) = (Σ aij fij)(es, et) for s, t = 1, ..., n. We have
          (Σ aij fij)(es, et) = Σ aij fij(es, et) = Σ aij φi(es)φj(et)
                              = Σ aij δis δjt = ast = f(es, et)
as required. Hence {fij} spans B(V).

It remains to show that {fij} is linearly independent. Suppose Σ aij fij = 0. Then for s, t = 1, ..., n,
          0 = 0(es, et) = (Σ aij fij)(es, et) = ast
The last step follows as above. Thus {fij} is independent and hence is a basis of B(V).



12.5. Let [f] denote the matrix representation of a bilinear form f on V relative to a basis {e1, ..., en} of V. Show that the mapping f ↦ [f] is an isomorphism of B(V) onto the vector space of n-square matrices.

Since f is completely determined by the scalars f(ei, ej), the mapping f ↦ [f] is one-to-one and onto. It suffices to show that the mapping f ↦ [f] is a homomorphism; that is, that
          [af + bg] = a[f] + b[g]          (*)
However, for i, j = 1, ..., n,
          (af + bg)(ei, ej) = af(ei, ej) + bg(ei, ej)
which is a restatement of (*). Thus the result is proved.



12.6. Prove Theorem 12.2: Let P be the transition matrix from one basis {ei} to another basis {e′i}. If A is the matrix of f in the original basis {ei}, then B = PᵗAP is the matrix of f in the new basis {e′i}.

Let u, v ∈ V. Since P is the transition matrix from {ei} to {e′i}, we have P[u]e′ = [u]e and P[v]e′ = [v]e; hence [u]eᵗ = [u]e′ᵗ Pᵗ. Thus
          f(u, v) = [u]eᵗ A [v]e = [u]e′ᵗ PᵗAP [v]e′
Since u and v are arbitrary elements of V, PᵗAP is the matrix of f in the basis {e′i}.



SYMMETRIC BILINEAR FORMS. QUADRATIC FORMS 

12.7. Find the symmetric matrix which corresponds to each of the following quadratic polynomials:

(i) q(x, y) = 4x² - 6xy - 7y²          (iii) q(x, y, z) = 3x² + 4xy - y² + 8xz - 6yz + z²
(ii) q(x, y) = xy + y²                 (iv) q(x, y, z) = x² - 2yz + xz

The symmetric matrix A = (aij) representing q(x1, ..., xn) has the diagonal entry aii equal to the coefficient of xi² and has the entries aij and aji each equal to half the coefficient of xixj. Thus

(i) ( 4 -3 ; -3 -7 )          (iii) ( 3 2 4 ; 2 -1 -3 ; 4 -3 1 )
(ii) ( 0 ½ ; ½ 1 )            (iv) ( 1 0 ½ ; 0 0 -1 ; ½ -1 0 )

12.8. For each of the following real symmetric matrices A, find a nonsingular matrix P such that PᵗAP is diagonal and also find its signature:

          (i) A = ( 1 -3 2 ; -3 7 -5 ; 2 -5 8 )          (ii) A = ( 0 1 1 ; 1 -2 2 ; 1 2 -1 )

(i) First form the block matrix (A, I):

          (A, I) = ( 1 -3 2 | 1 0 0 ; -3 7 -5 | 0 1 0 ; 2 -5 8 | 0 0 1 )

Apply the row operations R2 → 3R1 + R2 and R3 → -2R1 + R3 to (A, I) and then the corresponding column operations C2 → 3C1 + C2 and C3 → -2C1 + C3 to A to obtain

          ( 1 -3 2 | 1 0 0 ; 0 -2 1 | 3 1 0 ; 0 1 4 | -2 0 1 )   and then   ( 1 0 0 | 1 0 0 ; 0 -2 1 | 3 1 0 ; 0 1 4 | -2 0 1 )

Next apply the row operation R3 → R2 + 2R3 and then the corresponding column operation C3 → C2 + 2C3 to obtain

          ( 1 0 0 | 1 0 0 ; 0 -2 1 | 3 1 0 ; 0 0 9 | -1 1 2 )   and then   ( 1 0 0 | 1 0 0 ; 0 -2 0 | 3 1 0 ; 0 0 18 | -1 1 2 )

Now A has been diagonalized. Set

          P = ( 1 3 -1 ; 0 1 1 ; 0 0 2 )   ;   then   PᵗAP = ( 1 0 0 ; 0 -2 0 ; 0 0 18 )

The signature S of A is S = 2 - 1 = 1.

(ii) First form the block matrix (A, I):

          (A, I) = ( 0 1 1 | 1 0 0 ; 1 -2 2 | 0 1 0 ; 1 2 -1 | 0 0 1 )

In order to bring the nonzero diagonal entry -1 into the first diagonal position, apply the row operation R1 ↔ R3 and then the corresponding column operation C1 ↔ C3 to obtain

          ( 1 2 -1 | 0 0 1 ; 1 -2 2 | 0 1 0 ; 0 1 1 | 1 0 0 )   and then   ( -1 2 1 | 0 0 1 ; 2 -2 1 | 0 1 0 ; 1 1 0 | 1 0 0 )

Apply the row operations R2 → 2R1 + R2 and R3 → R1 + R3 and then the corresponding column operations C2 → 2C1 + C2 and C3 → C1 + C3 to obtain

          ( -1 2 1 | 0 0 1 ; 0 2 3 | 0 1 2 ; 0 3 1 | 1 0 1 )   and then   ( -1 0 0 | 0 0 1 ; 0 2 3 | 0 1 2 ; 0 3 1 | 1 0 1 )

Apply the row operation R3 → -3R2 + 2R3 and then the corresponding column operation C3 → -3C2 + 2C3 to obtain

          ( -1 0 0 | 0 0 1 ; 0 2 3 | 0 1 2 ; 0 0 -7 | 2 -3 -4 )   and then   ( -1 0 0 | 0 0 1 ; 0 2 0 | 0 1 2 ; 0 0 -14 | 2 -3 -4 )

Now A has been diagonalized. Set

          P = ( 0 0 2 ; 0 1 -3 ; 1 2 -4 )   ;   then   PᵗAP = ( -1 0 0 ; 0 2 0 ; 0 0 -14 )

The signature S of A is the difference S = 1 - 2 = -1.



12.9. Suppose 1 + 1 ≠ 0 in K. Give a formal algorithm to diagonalize (under congruence) a symmetric matrix A = (aij) over K.

Case I: a11 ≠ 0. Apply the row operations Ri → -ai1R1 + a11Ri, i = 2, ..., n, and then the corresponding column operations Ci → -a1iC1 + a11Ci to reduce A to the form ( a11 0 ; 0 B ).

Case II: a11 = 0 but aii ≠ 0, for some i > 1. Apply the row operation R1 ↔ Ri and then the corresponding column operation C1 ↔ Ci to bring aii into the first diagonal position. This reduces the matrix to Case I.

Case III: All diagonal entries aii = 0. Choose i, j such that aij ≠ 0, and apply the row operation Ri → Rj + Ri and the corresponding column operation Ci → Cj + Ci to bring 2aij ≠ 0 into the ith diagonal position. This reduces the matrix to Case II.

In each of the cases, we can finally reduce A to the form ( a11 0 ; 0 B ) where B is a symmetric matrix of order less than A. By induction we can finally bring A into diagonal form.

Remark: The hypothesis that 1 + 1 ≠ 0 in K is used in Case III where we state that 2aij ≠ 0.
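The three cases above translate almost directly into code. The following is a minimal sketch of the algorithm (an illustration under the stated assumption 1 + 1 ≠ 0, not an optimized routine); it works over the rationals via Fraction so that the divisions in Case I stay exact, and it accumulates the column operations to produce P with PᵗAP diagonal.

```python
from fractions import Fraction
import numpy as np

def congruence_diagonalize(A_rows):
    """Return (D, P) with D = P^t A P diagonal, for a symmetric matrix A (assumes 1+1 != 0)."""
    n = len(A_rows)
    A = np.array([[Fraction(x) for x in row] for row in A_rows], dtype=object)
    P = np.array([[Fraction(1 if i == j else 0) for j in range(n)] for i in range(n)], dtype=object)
    for k in range(n):
        if A[k, k] == 0:                        # Case II: swap in a nonzero diagonal entry
            for i in range(k + 1, n):
                if A[i, i] != 0:
                    A[[k, i], :] = A[[i, k], :]
                    A[:, [k, i]] = A[:, [i, k]]
                    P[:, [k, i]] = P[:, [i, k]]
                    break
        if A[k, k] == 0:                        # Case III: all remaining diagonal entries are 0
            for i in range(k + 1, n):
                if A[k, i] != 0:
                    A[k, :] = A[k, :] + A[i, :]     # R_k -> R_k + R_i
                    A[:, k] = A[:, k] + A[:, i]     # C_k -> C_k + C_i, so A[k,k] = 2*a_ki
                    P[:, k] = P[:, k] + P[:, i]
                    break
        if A[k, k] == 0:
            continue                            # row/column k is already zero from k onward
        for i in range(k + 1, n):               # Case I: clear the rest of row and column k
            c = A[i, k] / A[k, k]
            A[i, :] = A[i, :] - A[k, :] * c     # R_i -> R_i - c R_k
            A[:, i] = A[:, i] - A[:, k] * c     # C_i -> C_i - c C_k
            P[:, i] = P[:, i] - P[:, k] * c
    return A, P

# Example: the matrix of Example 12.4.
D, P = congruence_diagonalize([[1, 2, -3], [2, 5, -4], [-3, -4, 8]])
print(D)    # diagonal with entries 1, 1, -5 (as Fractions)
print(P)    # matches the P found in Example 12.4
```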



12.10. Let q be the quadratic form associated with the symmetric bilinear form f. Verify the following polar form of f: f(u, v) = ½(q(u + v) - q(u) - q(v)). (Assume that 1 + 1 ≠ 0.)

          q(u + v) - q(u) - q(v) = f(u + v, u + v) - f(u, u) - f(v, v)
                                 = f(u, u) + f(u, v) + f(v, u) + f(v, v) - f(u, u) - f(v, v)
                                 = 2f(u, v)
If 1 + 1 ≠ 0, we can divide by 2 to obtain the required identity.



12.11. Prove Theorem 12.4: Let f be a symmetric bilinear form on V over K (in which 1 + 1 ≠ 0). Then V has a basis {v1, ..., vn} in which f is represented by a diagonal matrix, i.e. f(vi, vj) = 0 for i ≠ j.

Method 1.
If f = 0 or if dim V = 1, then the theorem clearly holds. Hence we can suppose f ≠ 0 and dim V = n > 1. If q(v) = f(v, v) = 0 for every v ∈ V, then the polar form of f (see Problem 12.10) implies that f = 0. Hence we can assume there is a vector v1 ∈ V such that f(v1, v1) ≠ 0. Let U be the subspace spanned by v1 and let W consist of those vectors v ∈ V for which f(v1, v) = 0. We claim that V = U ⊕ W.

(i) Proof that U ∩ W = {0}: Suppose u ∈ U ∩ W. Since u ∈ U, u = kv1 for some scalar k ∈ K. Since u ∈ W, 0 = f(u, u) = f(kv1, kv1) = k² f(v1, v1). But f(v1, v1) ≠ 0; hence k = 0 and therefore u = kv1 = 0. Thus U ∩ W = {0}.

(ii) Proof that V = U + W: Let v ∈ V. Set
          w = v - (f(v1, v)/f(v1, v1)) v1          (1)
Then
          f(v1, w) = f(v1, v) - (f(v1, v)/f(v1, v1)) f(v1, v1) = 0
Thus w ∈ W. By (1), v is the sum of an element of U and an element of W. Thus V = U + W.

By (i) and (ii), V = U ⊕ W.

Now f restricted to W is a symmetric bilinear form on W. But dim W = n - 1; hence by induction there is a basis {v2, ..., vn} of W such that f(vi, vj) = 0 for i ≠ j and 2 ≤ i, j ≤ n. But by the very definition of W, f(v1, vj) = 0 for j = 2, ..., n. Therefore the basis {v1, ..., vn} of V has the required property that f(vi, vj) = 0 for i ≠ j.

Method 2.
The algorithm in Problem 12.9 shows that every symmetric matrix over K is congruent to a diagonal matrix. This is equivalent to the statement that f has a diagonal matrix representation.



12.12. Let A = diag(a1, a2, ..., an), a diagonal matrix over K. Show that:

(i) for any nonzero scalars k1, ..., kn ∈ K, A is congruent to a diagonal matrix with diagonal entries ai ki²;
(ii) if K is the complex field C, then A is congruent to a diagonal matrix with only 1's and 0's as diagonal entries;
(iii) if K is the real field R, then A is congruent to a diagonal matrix with only 1's, -1's and 0's as diagonal entries.

(i) Let P be the diagonal matrix with diagonal entries ki. Then
          PᵗAP = diag(a1k1², a2k2², ..., ankn²)

(ii) Let P be the diagonal matrix with diagonal entries bi = 1/√ai if ai ≠ 0 and bi = 1 if ai = 0. Then PᵗAP has the required form.

(iii) Let P be the diagonal matrix with diagonal entries bi = 1/√|ai| if ai ≠ 0 and bi = 1 if ai = 0. Then PᵗAP has the required form.

Remark. We emphasize that (ii) is no longer true if congruence is replaced by Hermitian congruence (see Problems 12.40 and 12.41).



12.13. Prove Theorem 12.5: Let f be a symmetric bilinear form on V over R. Then there is a basis of V in which f is represented by a diagonal matrix, and every other diagonal representation of f has the same number of positive entries and the same number of negative entries.

By Theorem 12.4, there is a basis {u1, ..., un} of V in which f is represented by a diagonal matrix, say, with P positive and N negative entries. Now suppose {w1, ..., wn} is another basis of V in which f is represented by a diagonal matrix, say, with P′ positive and N′ negative entries. We can assume without loss in generality that the positive entries in each matrix appear first. Since rank(f) = P + N = P′ + N′, it suffices to prove that P = P′.

Let U be the linear span of u1, ..., u_P and let W be the linear span of w_{P′+1}, ..., wn. Then f(v, v) > 0 for every nonzero v ∈ U, and f(v, v) ≤ 0 for every nonzero v ∈ W. Hence U ∩ W = {0}. Note that dim U = P and dim W = n - P′. Thus
          dim(U + W) = dim U + dim W - dim(U ∩ W) = P + (n - P′) - 0 = P - P′ + n
But dim(U + W) ≤ dim V = n; hence P - P′ + n ≤ n or P ≤ P′. Similarly, P′ ≤ P and therefore P = P′, as required.

Remark. The above theorem and proof depend only on the concept of positivity. Thus the theorem is true for any subfield K of the real field R.

12.14. An n × n real symmetric matrix A is said to be positive definite if XᵗAX > 0 for every nonzero (column) vector X ∈ Rⁿ, i.e. if A is positive definite viewed as a bilinear form. Let B be any real nonsingular matrix. Show that (i) BᵗB is symmetric and (ii) BᵗB is positive definite.

(i) (BᵗB)ᵗ = BᵗBᵗᵗ = BᵗB; hence BᵗB is symmetric.

(ii) Since B is nonsingular, BX ≠ 0 for any nonzero X ∈ Rⁿ. Hence the dot product of BX with itself, BX·BX = (BX)ᵗ(BX), is positive. Thus Xᵗ(BᵗB)X = (XᵗBᵗ)(BX) = (BX)ᵗ(BX) > 0 as required.

HERMITIAN FORMS 

12.15. Determine which of the following matrices are Hermitian: 

2-i 4 + i\ 
6 i 

i 3 / 

(ii) 

A matrix A = (aij) is Hermitian iff A = A*, i.e. iff aij = āji.
(i) The matrix is Hermitian, since it is equal to its conjugate transpose.
(ii) The matrix is not Hermitian, even though it is symmetric.
(iii) The matrix is Hermitian. In fact, a real matrix is Hermitian if and only if it is symmetric.

12.16. Let A be a Hermitian matrix. Show that f is a Hermitian form on Cⁿ where f is defined by f(X, Y) = XᵗAȲ.

For all a, b ∈ C and all X1, X2, Y ∈ Cⁿ,
          f(aX1 + bX2, Y) = (aX1 + bX2)ᵗAȲ = (aX1ᵗ + bX2ᵗ)AȲ
                          = aX1ᵗAȲ + bX2ᵗAȲ = af(X1, Y) + bf(X2, Y)
Hence f is linear in the first variable. Also,
          \overline{f(X, Y)} = \overline{XᵗAȲ} = X̄ᵗĀY = (X̄ᵗĀY)ᵗ = YᵗĀᵗX̄ = YᵗA*X̄ = YᵗAX̄ = f(Y, X)
Hence f(X, Y) = \overline{f(Y, X)}, and so f is a Hermitian form on Cⁿ. (Remark. We use the fact that X̄ᵗĀY is a scalar and so it is equal to its transpose.)

12.17. Let f be a Hermitian form on V. Let H be the matrix of f in a basis {e1, ..., en} of V. Show that:

(i) f(u, v) = [u]eᵗ H \overline{[v]e} for all u, v ∈ V;

(ii) if P is the transition matrix from {ei} to a new basis {e′i} of V, then B = PᵗHP̄ (or: B = Q*HQ where Q = P̄) is the matrix of f in the new basis {e′i}.

Note that (ii) is the complex analog of Theorem 12.2.

(i) Let u, v ∈ V and suppose
          u = a1e1 + a2e2 + ··· + anen   and   v = b1e1 + b2e2 + ··· + bnen
Then
          f(u, v) = f(a1e1 + ··· + anen, b1e1 + ··· + bnen)
                  = Σ_{i,j} ai b̄j f(ei, ej) = (a1, ..., an) H (b̄1, ..., b̄n)ᵗ = [u]eᵗ H \overline{[v]e}
as required.

(ii) Since P is the transition matrix from {ei} to {e′i}, then
          P[u]e′ = [u]e,   P[v]e′ = [v]e   and so   [u]eᵗ = [u]e′ᵗ Pᵗ,   \overline{[v]e} = P̄ \overline{[v]e′}
Thus by (i), f(u, v) = [u]eᵗ H \overline{[v]e} = [u]e′ᵗ PᵗHP̄ \overline{[v]e′}. But u and v are arbitrary elements of V; hence PᵗHP̄ is the matrix of f in the basis {e′i}.

12.18. Let H = ( 1 1+i 2i ; 1-i 4 2-3i ; -2i 2+3i 7 ), a Hermitian matrix. Find a nonsingular matrix P such that PᵗHP̄ is diagonal.

First form the block matrix (H, I):

          (H, I) = ( 1 1+i 2i | 1 0 0 ; 1-i 4 2-3i | 0 1 0 ; -2i 2+3i 7 | 0 0 1 )

Apply the row operations R2 → (-1+i)R1 + R2 and R3 → 2iR1 + R3 to (H, I) and then the corresponding "Hermitian column operations" (see Problem 12.42) C2 → (-1-i)C1 + C2 and C3 → -2iC1 + C3 to H to obtain

          ( 1 1+i 2i | 1 0 0 ; 0 2 -5i | -1+i 1 0 ; 0 5i 3 | 2i 0 1 )   and then   ( 1 0 0 | 1 0 0 ; 0 2 -5i | -1+i 1 0 ; 0 5i 3 | 2i 0 1 )

Next apply the row operation R3 → -5iR2 + 2R3 and the corresponding Hermitian column operation C3 → 5iC2 + 2C3 to obtain

          ( 1 0 0 | 1 0 0 ; 0 2 -5i | -1+i 1 0 ; 0 0 -19 | 5+9i -5i 2 )   and then   ( 1 0 0 | 1 0 0 ; 0 2 0 | -1+i 1 0 ; 0 0 -38 | 5+9i -5i 2 )

Now H has been diagonalized. Set

          P = ( 1 -1+i 5+9i ; 0 1 -5i ; 0 0 2 )   and then   PᵗHP̄ = ( 1 0 0 ; 0 2 0 ; 0 0 -38 )

Observe that the signature S of H is S = 2 - 1 = 1.



MISCELLANEOUS PROBLEMS 

12.19. Show that any bilinear form f on V is the sum of a symmetric bilinear form and a skew symmetric bilinear form.

Set g(u, v) = ½[f(u, v) + f(v, u)] and h(u, v) = ½[f(u, v) - f(v, u)]. Then g is symmetric because
          g(u, v) = ½[f(u, v) + f(v, u)] = ½[f(v, u) + f(u, v)] = g(v, u)
and h is skew symmetric because
          h(u, v) = ½[f(u, v) - f(v, u)] = -½[f(v, u) - f(u, v)] = -h(v, u)
Furthermore, f = g + h.



274 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12 

12.20. Prove Theorem 12.3: Let f be an alternating bilinear form on V. Then there exists a basis of V in which f is represented by a matrix of the form given in Theorem 12.3 (diagonal blocks ( 0 1 ; -1 0 ) followed by zeros). Moreover, the number of blocks ( 0 1 ; -1 0 ) is uniquely determined by f (because it is equal to ½[rank(f)]).

If f = 0, then the theorem is obviously true. Also, if dim V = 1, then f(k1u, k2u) = k1k2 f(u, u) = 0 and so f = 0. Accordingly we can assume that dim V > 1 and f ≠ 0.

Since f ≠ 0, there exist (nonzero) u1, u2 ∈ V such that f(u1, u2) ≠ 0. In fact, multiplying u1 by an appropriate factor, we can assume that f(u1, u2) = 1 and so f(u2, u1) = -1. Now u1 and u2 are linearly independent; because if, say, u2 = ku1, then f(u1, u2) = f(u1, ku1) = k f(u1, u1) = 0. Let U be the subspace spanned by u1 and u2, i.e. U = L(u1, u2). Note:

(i) the matrix representation of the restriction of f to U in the basis {u1, u2} is ( 0 1 ; -1 0 );

(ii) if u ∈ U, say u = au1 + bu2, then
          f(u, u1) = f(au1 + bu2, u1) = -b
          f(u, u2) = f(au1 + bu2, u2) = a

Let W consist of those vectors w ∈ V such that f(w, u1) = 0 and f(w, u2) = 0. Equivalently,
          W = {w ∈ V : f(w, u) = 0 for every u ∈ U}
We claim that V = U ⊕ W. It is clear that U ∩ W = {0}, and so it remains to show that V = U + W. Let v ∈ V. Set
          u = f(v, u2)u1 - f(v, u1)u2   and   w = v - u          (1)
Since u is a linear combination of u1 and u2, u ∈ U. We show that w ∈ W. By (1) and (ii), f(u, u1) = f(v, u1); hence
          f(w, u1) = f(v - u, u1) = f(v, u1) - f(u, u1) = 0
Similarly, f(u, u2) = f(v, u2) and so
          f(w, u2) = f(v - u, u2) = f(v, u2) - f(u, u2) = 0
Then w ∈ W and so, by (1), v = u + w where u ∈ U and w ∈ W. This shows that V = U + W; and therefore V = U ⊕ W.

Now the restriction of f to W is an alternating bilinear form on W. By induction, there exists a basis u3, ..., un of W in which the matrix representing f restricted to W has the desired form. Thus u1, u2, u3, ..., un is a basis of V in which the matrix representing f has the desired form.



(ii) Find the matrix of / in the basis -'' ^ °^ (^ ^^ 1^ °^ 1^ ° 



CHAP. 12] BILINEAR, QUADRATIC AND HERMITIAN FORMS 275 

Supplementary Problems 

BILINEAR FORMS 

12.21. Let u = (x1, x2) and v = (y1, y2). Determine which of the following are bilinear forms on R²:
       (i) f(u, v) = 2x1y2 - 3x2y1        (iv) f(u, v) = x1x2 + y1y2
       (ii) f(u, v) = x1 + y2             (v) f(u, v) = 1
       (iii) f(u, v) = 3x2y2              (vi) f(u, v) = 0.

12.22. Let f be the bilinear form on R² defined by
          f((x1, x2), (y1, y2)) = 3x1y1 - 2x1y2 + 4x2y1 - x2y2
       (i) Find the matrix A of f in the basis {u1 = (1, 1), u2 = (1, 2)}.
       (ii) Find the matrix B of f in the basis {v1 = (1, -1), v2 = (3, 1)}.
       (iii) Find the transition matrix P from {ui} to {vi} and verify that B = PᵗAP.

12.23. Let V be the vector space of 2 X 2 matrices over B. Let Af = ( ^ j, and let f{A,B) = 

tr (At MB) where A,B&V and "tr" denotes trace, (i) Show that / is a bilinear form on V. 

'1 0\ /O 1\ /o 0\ /O 0^ 

o)'\o oj' [1 o)'[q 1, 

12.24. Let B(V) be the set of bilinear forms on V over K. Prove:
       (i) if f, g ∈ B(V), then f + g and kf, for k ∈ K, also belong to B(V), and so B(V) is a subspace of the vector space of functions from V × V into K;
       (ii) if φ and σ are linear functionals on V, then f(u, v) = φ(u)σ(v) belongs to B(V).

12.25. Let f be a bilinear form on V. For any subset S of V, we write
          S⊥ = {v ∈ V : f(u, v) = 0 for every u ∈ S},   S⊤ = {v ∈ V : f(v, u) = 0 for every u ∈ S}
       Show that: (i) S⊥ and S⊤ are subspaces of V; (ii) S1 ⊆ S2 implies S2⊥ ⊆ S1⊥ and S2⊤ ⊆ S1⊤; (iii) {0}⊥ = {0}⊤ = V.

12.26. Prove: If f is a bilinear form on V, then rank(f) = dim V - dim V⊥ = dim V - dim V⊤ and hence dim V⊥ = dim V⊤.

12.27. Let f be a bilinear form on V. For each u ∈ V, let û : V → K and ũ : V → K be defined by û(x) = f(x, u) and ũ(x) = f(u, x). Prove:
       (i) û and ũ are each linear, i.e. û, ũ ∈ V*;
       (ii) u ↦ û and u ↦ ũ are each linear mappings from V into V*;
       (iii) rank(f) = rank(u ↦ û) = rank(u ↦ ũ).

12.28. Show that congruence of matrices is an equivalence relation, i.e. (i) A is congruent to A; (ii) if A is congruent to B, then B is congruent to A; (iii) if A is congruent to B and B is congruent to C, then A is congruent to C.

SYMMETRIC BILINEAR FORMS. QUADRATIC FORMS 

12.29. Find the symmetric matrix belonging to each of the following quadratic polynomials:
       (i) q(x, y, z) = 2x² - 8xy + y² - 16xz + 14yz + 5z²
       (ii) q(x, y, z) = x² - xz + y²
       (iii) q(x, y, z) = xy + y² + 4xz + z²
       (iv) q(x, y, z) = xy + yz.



276 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12 

12.30. For each of the following matrices A, find a nonsingular matrix P such that P'^AP is diagonal: 

1 1-2-3^ 

2 3\ / I ! ^ \ (12-5-1 

-2-5 6 9 
-3 -1 9 11 




(i) ^ = (3 4)' («) A = -2 6 -9 . (iii) A = , 2 _g g 



In each case find the rank and signature. 

12.31. Let q be the quadratic form associated with a symmetric bilinear form f. Verify the following alternate polar form of f: f(u, v) = ¼[q(u + v) - q(u - v)].

12.32. Let S(V) be the set of symmetric bilinear forms on V. Show that:
       (i) S(V) is a subspace of B(V); (ii) if dim V = n, then dim S(V) = ½n(n + 1).

12.33. Let f be the symmetric bilinear form associated with the real quadratic form q(x, y) = ax² + bxy + cy². Show that:
       (i) f is nondegenerate if and only if b² - 4ac ≠ 0;
       (ii) f is positive definite if and only if a > 0 and b² - 4ac < 0.

12.34. Suppose A is a real symmetric positive definite matrix. Show that there exists a nonsingular matrix P such that A = PᵗP.

12.35. Consider a real quadratic polynomial q(x1, ..., xn) = Σ_{i,j=1}^{n} aij xi xj, where aij = aji.
       (i) If a11 ≠ 0, show that the substitution
              x1 = y1 - (1/a11)(a12y2 + ··· + a1nyn),   x2 = y2, ..., xn = yn
       yields the equation q(x1, ..., xn) = a11y1² + q′(y2, ..., yn), where q′ is also a quadratic polynomial.
       (ii) If a11 = 0 but, say, a12 ≠ 0, show that the substitution
              x1 = y1 + y2,   x2 = y1 - y2,   x3 = y3, ..., xn = yn
       yields the equation q(x1, ..., xn) = Σ bij yi yj, where b11 ≠ 0, i.e. reduces this case to case (i).
       This method of diagonalizing q is known as "completing the square".

12.36. Use steps of the type in the preceding problem to reduce each quadratic polynomial in Problem 12.29 to diagonal form. Find the rank and signature in each case.

HERMITIAN FORMS 

12.37. For any complex matrices A, B and any k ∈ C, show that:
       (i) \overline{A + B} = Ā + B̄,  (ii) \overline{kA} = k̄ Ā,  (iii) \overline{AB} = Ā B̄,  (iv) \overline{Aᵗ} = (Ā)ᵗ.

12.38. For each of the following Hermitian matrices H, find a nonsingular matrix P such that PᵗHP̄ is diagonal.
       Find the rank and signature in each case.

12.39. Let A be any complex nonsingular matrix. Show that H = A*A is Hermitian and positive definite.

12.40. We say that B is Hermitian congruent to A if there exists a nonsingular matrix Q such that B = Q*AQ. Show that Hermitian congruence is an equivalence relation.



CHAP. 12] BILINEAR, QUADRATIC AND HERMITIAN FORMS 277 

12.41. Prove Theorem 12.7: Let f be a Hermitian form on V. Then there exists a basis {e1, ..., en} of V in which f is represented by a diagonal matrix, i.e. f(ei, ej) = 0 for i ≠ j. Moreover, every diagonal representation of f has the same number P of positive entries and the same number N of negative entries. (Note that the second part of the theorem does not hold for complex symmetric bilinear forms, as seen by Problem 12.12(ii). However, the proof of Theorem 12.5 in Problem 12.13 does carry over to the Hermitian case.)

MISCELLANEOUS PROBLEMS 

12.42. Consider the following elementary row operations:
          [a1] Ri ↔ Rj,   [a2] Ri → kRi, k ≠ 0,   [a3] Ri → kRj + Ri
       The corresponding elementary column operations are, respectively,
          [b1] Ci ↔ Cj,   [b2] Ci → kCi, k ≠ 0,   [b3] Ci → kCj + Ci
       If K is the complex field C, then the corresponding Hermitian column operations are, respectively,
          [c1] Ci ↔ Cj,   [c2] Ci → k̄Ci, k ≠ 0,   [c3] Ci → k̄Cj + Ci
       (i) Show that the elementary matrix corresponding to [bi] is the transpose of the elementary matrix corresponding to [ai].
       (ii) Show that the elementary matrix corresponding to [ci] is the conjugate transpose of the elementary matrix corresponding to [ai].

12.43. Let V and W be vector spaces over K. A mapping f : V × W → K is called a bilinear form on V and W if:
          (i) f(av1 + bv2, w) = af(v1, w) + bf(v2, w)
          (ii) f(v, aw1 + bw2) = af(v, w1) + bf(v, w2)
       for every a, b ∈ K, vi ∈ V, wj ∈ W. Prove the following:
       (i) The set B(V, W) of bilinear forms on V and W is a subspace of the vector space of functions from V × W into K.
       (ii) If {φ1, ..., φm} is a basis of V* and {σ1, ..., σn} is a basis of W*, then {fij : i = 1, ..., m, j = 1, ..., n} is a basis of B(V, W) where fij is defined by fij(v, w) = φi(v)σj(w). Thus dim B(V, W) = dim V · dim W.
       (Remark. Observe that if V = W, then we obtain the space B(V) investigated in this chapter.)

12.44. Let V be a vector space over K. A mapping f : V × V × ··· × V → K (m times) is called a multilinear (or: m-linear) form on V if f is linear in each variable, i.e. for i = 1, ..., m,
          f(..., au + bv, ...) = af(..., u, ...) + bf(..., v, ...)
       where the displayed argument au + bv occupies the ith component, and the other components are held fixed. An m-linear form f is said to be alternating if
          f(v1, ..., vm) = 0 whenever vi = vk, i ≠ k
       Prove:
       (i) The set Bm(V) of m-linear forms on V is a subspace of the vector space of functions from V × V × ··· × V into K.
       (ii) The set Am(V) of alternating m-linear forms on V is a subspace of Bm(V).
       Remark 1. If m = 2, then we obtain the space B(V) investigated in this chapter.
       Remark 2. If V = Kᵐ, then the determinant function is a particular alternating m-linear form on V.



278 BILINEAR, QUADRATIC AND HERMITIAN FORMS [CHAP. 12



Answers to Supplementary Problems 



12.21. (i) Yes (ii) No (iii) Yes (iv) No (v) No (vi) Yes 



12.22. (i) A = ( 4 1 ; 7 3 ),   (ii) B = ( 0 -4 ; 20 32 ),   (iii) P = ( 3 5 ; -2 -2 )

12.23. (ii) 




12.29. (i) ( 2 -4 -8 ; -4 1 7 ; -8 7 5 )        (iii) ( 0 ½ 2 ; ½ 1 0 ; 2 0 1 )
       (ii) ( 1 0 -½ ; 0 1 0 ; -½ 0 0 )          (iv) ( 0 ½ 0 ; ½ 0 ½ ; 0 ½ 0 )



12.30. (i) P = 



(ii) P - 



(iii) P = 



1 -3 
2 



PtAP 



2 
-2 



S = 0. 



'1 2 0\ /^ ° ° 

1 3 , PtAP =02 

2/ loo -38 y 



S = 1. 



/l -1 -1 26 \ 
1 3 13 
19 

\o 7 



PtAP = 




S = 2. 



12.38. (i) P = 

(ii) P = 

(iii) P = 



1 A ptHP = ^1 » 



1 



1 -2 + 3i 
1 



1 i -3 + i 
1 i 

1 



1 



PtHP = 



PtHP 



S = 2. 



-14 



'1 0' 

10 

lO 0-4, 



S = 1. 



chapter 13 



Inner Product Spaces 



INTRODUCTION 

The definition of a vector space V involves an arbitrary field K. In this chapter we 
restrict K to be either the real field R or the complex field C. In the first case we call V a 
real vector space, and in the second case a complex vector space. 

Recall that the concepts of "length" and "orthogonality" did not appear in the investiga- 
tion of arbitrary vector spaces (although they did appear in Chapter 1 on the spaces R" and 
C"). In this chapter we place an additional structure on a vector space V to obtain an 
inner product space, and in this context these concepts are defined. 

We emphasize that V shall denote a vector space of finite dimension unless otherwise 
stated or implied. In fact, many of the theorems in this chapter are not valid for spaces of 
infinite dimension. This is illustrated by some of the examples and problems. 

INNER PRODUCT SPACES 

We begin with a definition. 

Definition: Let F be a (real or complex) vector space over K. Suppose to each pair of 
vectors u,v GV there is assigned a scalar {u, v) G K. This mapping is called 
an inner product in V if it satisfies the following axioms: 

[I1] (au1 + bu2, v) = a(u1, v) + b(u2, v)

[I2] (u, v) = \overline{(v, u)}

[I3] (u, u) ≥ 0; and (u, u) = 0 if and only if u = 0.

The vector space V with an inner product is called an inner product space.

Observe that (u, u) is always real by [I2], and so the inequality relation in [I3] makes sense. We also use the notation
          ||u|| = √(u, u)

This nonnegative real number ||u|| is called the norm or length of u. Also, using [I1] and [I2] we obtain (Problem 13.1) the relation
          (u, av1 + bv2) = ā(u, v1) + b̄(u, v2)

If the base field K is real, the conjugate signs appearing above and in [/2] may be ignored. 

In the language of the preceding chapter, an inner product is a positive definite sym- 
metric bilinear form if the base field is real, and is a positive definite Hermitian form if the 
base field is complex. 

A real inner product space is sometimes called a Euclidean space, and a complex inner 
product space is sometimes called a unitary space. 

279 



280 INNER PRODUCT SPACES [CHAP. 13



Example 13.1: Consider the dot product in Rⁿ:
          u·v = a1b1 + a2b2 + ··· + anbn
where u = (ai) and v = (bi). This is an inner product on Rⁿ, and Rⁿ with this inner product is usually referred to as Euclidean n-space. Although there are many different ways to define an inner product on Rⁿ (see Problem 13.2), we shall assume this inner product on Rⁿ unless otherwise stated or implied.



Example 13.2: Consider the dot product on Cⁿ:
          u·v = z1w̄1 + z2w̄2 + ··· + znw̄n
where u = (zi) and v = (wi). As in the real case, this is an inner product on Cⁿ and we shall assume this inner product on Cⁿ unless otherwise stated or implied.

Example 13.3: Let V denote the vector space of m × n matrices over R. The following is an inner product in V:
          (A, B) = tr(BᵗA)
where tr stands for trace, the sum of the diagonal elements. Analogously, if U denotes the vector space of m × n matrices over C, then the following is an inner product in U:
          (A, B) = tr(B*A)
As usual, B* denotes the conjugate transpose of the matrix B.

Example 13.4: Let V be the vector space of real continuous functions on the interval a ≤ t ≤ b. Then the following is an inner product on V:
          (f, g) = ∫ₐᵇ f(t)g(t) dt
Analogously, if U denotes the vector space of complex continuous functions on the (real) interval a ≤ t ≤ b, then the following is an inner product on U:
          (f, g) = ∫ₐᵇ f(t)\overline{g(t)} dt

Example 13.5: Let V be the vector space of infinite sequences of real numbers (a1, a2, ...) satisfying
          Σ_{i=1}^{∞} ai² = a1² + a2² + ··· < ∞
i.e. the sum converges. Addition and scalar multiplication are defined componentwise:
          (a1, a2, ...) + (b1, b2, ...) = (a1 + b1, a2 + b2, ...)
          k(a1, a2, ...) = (ka1, ka2, ...)
An inner product is defined in V by
          ((a1, a2, ...), (b1, b2, ...)) = a1b1 + a2b2 + ···
The above sum converges absolutely for any pair of points in V (Problem 13.44); hence the inner product is well defined. This inner product space is called ℓ₂-space (or: Hilbert space).
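The inner products of Examples 13.1–13.3 are one-line computations. A minimal sketch (illustrative only; the vectors and matrices below are arbitrary choices, not from the text):

```python
import numpy as np

u, v = np.array([1., 2., 3.]), np.array([4., -1., 2.])
print(u @ v)                          # Example 13.1: dot product on R^n

z, w = np.array([1+2j, 3j]), np.array([2-1j, 1+1j])
print(z @ np.conj(w))                 # Example 13.2: z1*conj(w1) + ... on C^n

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[0., 1.], [1., 1.]])
print(np.trace(B.T @ A))              # Example 13.3: (A, B) = tr(B^t A)
```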

Remark 1: If ||v|| = 1, i.e. if (v, v) = 1, then v is called a unit vector or is said to be normalized. We note that every nonzero vector u ∈ V can be normalized by setting v = u/||u||.

Remark 2: The nonnegative real number d(u, v) = ||v - u|| is called the distance between u and v; this function does satisfy the axioms of a metric space (see Problem 13.51).



CHAP. 13] INNER PRODUCT SPACES 281



CAUCHY-SCHWARZ INEQUALITY 

The following formula, called the Cauchy-Schwarz inequality, is used in many branches
of mathematics.

Theorem 13.1 (Cauchy-Schwarz): For any vectors u, v ∈ V,

    |⟨u, v⟩| ≤ ||u|| ||v||

Next we examine this inequality in specific cases.

Example 13.6: Consider any complex numbers a1, ..., an, b1, ..., bn ∈ C. Then by the Cauchy-
Schwarz inequality,

    |a1b1 + ··· + anbn|² ≤ (|a1|² + ··· + |an|²)(|b1|² + ··· + |bn|²)

that is, |u · v|² ≤ ||u||² ||v||², where u = (ai) and v = (bi).

Example 13.7: Let f and g be any real continuous functions defined on the unit interval 0 ≤ t ≤ 1.
Then by the Cauchy-Schwarz inequality,

    ⟨f, g⟩² = ( ∫_0^1 f(t) g(t) dt )² ≤ ∫_0^1 f(t)² dt ∫_0^1 g(t)² dt = ||f||² ||g||²

Here V is the inner product space of Example 13.4.
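
The inequality is easy to test numerically. The following minimal sketch is an added illustration (NumPy and the random vectors are assumptions of the example, not part of the text); it checks |⟨u, v⟩| ≤ ||u|| ||v|| for random complex vectors with the dot product of Example 13.2.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    v = rng.standard_normal(n) + 1j * rng.standard_normal(n)

    inner = np.vdot(v, u)      # <u, v> = sum u_i * conj(v_i); np.vdot conjugates its first argument
    lhs = abs(inner)
    rhs = np.linalg.norm(u) * np.linalg.norm(v)
    print(lhs <= rhs + 1e-12)  # True: Cauchy-Schwarz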



ORTHOGONALITY 

Let V be an inner product space. The vectors u, v ∈ V are said to be orthogonal if
⟨u, v⟩ = 0. The relation is clearly symmetric; that is, if u is orthogonal to v, then ⟨v, u⟩ =
\overline{⟨u, v⟩} = \overline{0} = 0 and so v is orthogonal to u. We note that 0 ∈ V is orthogonal to every
v ∈ V, for

    ⟨0, v⟩ = ⟨0v, v⟩ = 0⟨v, v⟩ = 0

Conversely, if u is orthogonal to every v ∈ V, then ⟨u, u⟩ = 0 and hence u = 0 by [I3].

Now suppose W is any subset of V. The orthogonal complement of W, denoted by W⊥
(read "W perp"), consists of those vectors in V which are orthogonal to every w ∈ W:

    W⊥ = {v ∈ V : ⟨v, w⟩ = 0 for every w ∈ W}

We show that W⊥ is a subspace of V. Clearly, 0 ∈ W⊥. Now suppose u, v ∈ W⊥. Then
for any a, b ∈ K and any w ∈ W,

    ⟨au + bv, w⟩ = a⟨u, w⟩ + b⟨v, w⟩ = a·0 + b·0 = 0

Thus au + bv ∈ W⊥ and therefore W⊥ is a subspace of V.

Theorem 13.2: Let W be a subspace of V. Then V is the direct sum of W and W⊥, i.e.
V = W ⊕ W⊥.



Now if W is a subspace of V, then V = W ⊕ W⊥ by the above theorem; hence there is
a unique projection E_W : V → V with image W and kernel W⊥. That is, if v ∈ V and
v = w + w', where w ∈ W, w' ∈ W⊥, then E_W is defined by E_W(v) = w. This mapping
E_W is called the orthogonal projection of V onto W.

Example 13.8: Let W be the z axis in R³, i.e.

    W = {(0, 0, c) : c ∈ R}

Then W⊥ is the xy plane, i.e.

    W⊥ = {(a, b, 0) : a, b ∈ R}

As noted previously, R³ = W ⊕ W⊥. The orthogonal projection E of R³ onto W
is given by E(x, y, z) = (0, 0, z).





Example 13.9: Consider a homogeneous system of linear equations over R:

    a11 x1 + a12 x2 + ··· + a1n xn = 0
    a21 x1 + a22 x2 + ··· + a2n xn = 0
    ..........................................
    am1 x1 + am2 x2 + ··· + amn xn = 0

or in matrix notation AX = 0. Recall that the solution space W may be viewed
as the kernel of the linear operator A. We may also view W as the set of all vectors
v = (x1, ..., xn) which are orthogonal to each row of A. Thus W is the orthogonal
complement of the row space of A. Theorem 13.2 then gives another proof of the
fundamental result: dim W = n − rank(A).
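
Example 13.9 can be checked numerically: the solution space of AX = 0 is the orthogonal complement of the row space of A, and its dimension is n − rank(A). The sketch below is an editorial addition; it uses the 2 × 5 matrix whose rows appear in Problem 13.9 and computes a basis of the null space from the SVD.

    import numpy as np

    A = np.array([[1., 2., 3., -1., 2.],
                  [2., 4., 7., 2., -1.]])
    m, n = A.shape

    # Right singular vectors belonging to zero singular values span W = {v : Av = 0}
    U, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > 1e-10))
    W_basis = Vt[rank:]                   # rows form a basis of the null space of A

    print(W_basis.shape[0] == n - rank)   # dim W = n - rank(A)  -> True
    print(np.allclose(A @ W_basis.T, 0))  # each basis vector is orthogonal to every row of A -> True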

Remark: If V is a real inner product space, then the angle θ between nonzero vectors
u, v ∈ V is defined by

    cos θ = ⟨u, v⟩ / ( ||u|| ||v|| )

By the Cauchy-Schwarz inequality, −1 ≤ cos θ ≤ 1 and so the angle θ always
exists. Observe that u and v are orthogonal if and only if they are "perpendicu-
lar", i.e. θ = π/2.
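
For instance, a two-line computation (an added illustration, not from the text) gives the angle between u = (1, 0) and v = (1, 1) in R²; the value π/4 is what the formula predicts.

    import numpy as np

    u = np.array([1.0, 0.0])
    v = np.array([1.0, 1.0])
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))  # clip guards against rounding just outside [-1, 1]
    print(theta, np.pi / 4)                           # both approximately 0.785398...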



ORTHONORMAL SETS 

A set {ui} of vectors in V is said to be orthogonal if its distinct elements are orthogonal,
i.e. if ⟨ui, uj⟩ = 0 for i ≠ j. In particular, the set {ui} is said to be orthonormal if it is
orthogonal and if each ui has length 1, that is, if

    ⟨ui, uj⟩ = δ_ij = 0 for i ≠ j,  1 for i = j

An orthonormal set can always be obtained from an orthogonal set of nonzero vectors by
normalizing each vector.

Example 13.10: Consider the usual basis of Euclidean 3-space R³:

    {e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1)}

It is clear that

    ⟨e1, e1⟩ = ⟨e2, e2⟩ = ⟨e3, e3⟩ = 1 and ⟨ei, ej⟩ = 0 for i ≠ j

That is, {e1, e2, e3} is an orthonormal basis of R³. More generally, the usual basis
of Rⁿ or of Cⁿ is orthonormal for every n.

Example 13.11: Let V be the vector space of real continuous functions on the interval −π ≤ t ≤ π
with inner product defined by ⟨f, g⟩ = ∫_{−π}^{π} f(t) g(t) dt. The following is a classi-
cal example of an orthogonal subset of V:

    {1, cos t, cos 2t, ..., sin t, sin 2t, ...}

The above orthogonal set plays a fundamental role in the theory of Fourier series.

The following properties of an orthonormal set will be used in the next section.

Lemma 13.3: An orthonormal set {u1, ..., ur} is linearly independent and, for any v ∈ V,
the vector

    w = v − ⟨v, u1⟩u1 − ⟨v, u2⟩u2 − ··· − ⟨v, ur⟩ur

is orthogonal to each of the ui.




GRAM-SCHMIDT ORTHOGONALIZATION PROCESS 

Orthonormal bases play an important role in inner product spaces. The next theorem
shows that such a basis always exists; its proof uses the celebrated Gram-Schmidt orthog-
onalization process.

Theorem 13.4: Let {v1, ..., vn} be an arbitrary basis of an inner product space V. Then
there exists an orthonormal basis {u1, ..., un} of V such that the transition
matrix from {vi} to {ui} is triangular; that is, for i = 1, ..., n,

    ui = a_{i1}v1 + a_{i2}v2 + ··· + a_{ii}vi

Proof. We set u1 = v1/||v1||; then {u1} is orthonormal. We next set

    w2 = v2 − ⟨v2, u1⟩u1 and u2 = w2/||w2||

By Lemma 13.3, w2 (and hence u2) is orthogonal to u1; then {u1, u2} is orthonormal. We next
set

    w3 = v3 − ⟨v3, u1⟩u1 − ⟨v3, u2⟩u2 and u3 = w3/||w3||

Again, by Lemma 13.3, w3 (and hence u3) is orthogonal to u1 and u2; then {u1, u2, u3} is ortho-
normal. In general, after obtaining {u1, ..., ui} we set

    w_{i+1} = v_{i+1} − ⟨v_{i+1}, u1⟩u1 − ··· − ⟨v_{i+1}, ui⟩ui and u_{i+1} = w_{i+1}/||w_{i+1}||

(Note that w_{i+1} ≠ 0 because v_{i+1} ∉ L(v1, ..., vi) = L(u1, ..., ui).) As above, {u1, ..., u_{i+1}}
is also orthonormal. By induction we obtain an orthonormal set {u1, ..., un} which is in-
dependent and hence a basis of V. The specific construction guarantees that the transition
matrix is indeed triangular.

Example 13.12: Consider the following basis of Euclidean space R³:

    {v1 = (1, 1, 1), v2 = (0, 1, 1), v3 = (0, 0, 1)}

We use the Gram-Schmidt orthogonalization process to transform {vi} into an ortho-
normal basis {ui}. First we normalize v1, i.e. we set

    u1 = v1/||v1|| = (1/√3, 1/√3, 1/√3)

Next we set

    w2 = v2 − ⟨v2, u1⟩u1 = (0, 1, 1) − (2/√3)(1/√3, 1/√3, 1/√3) = (−2/3, 1/3, 1/3)

and then we normalize w2, i.e. we set

    u2 = w2/||w2|| = (−2/√6, 1/√6, 1/√6)

Finally we set

    w3 = v3 − ⟨v3, u1⟩u1 − ⟨v3, u2⟩u2
       = (0, 0, 1) − (1/√3)(1/√3, 1/√3, 1/√3) − (1/√6)(−2/√6, 1/√6, 1/√6) = (0, −1/2, 1/2)

and then we normalize w3:

    u3 = w3/||w3|| = (0, −1/√2, 1/√2)

The required orthonormal basis of R³ is {u1, u2, u3}.
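
The computation above follows the recipe of Theorem 13.4 exactly, and the recipe is short enough to code directly. The sketch below is an editorial addition (NumPy and the helper name gram_schmidt are assumptions of the illustration); it reproduces the basis just found.

    import numpy as np

    def gram_schmidt(vectors):
        # Orthonormalize linearly independent vectors, as in the proof of Theorem 13.4
        basis = []
        for v in vectors:
            w = np.array(v, dtype=float)
            for u in basis:
                w = w - np.dot(w, u) * u        # subtract the component along each earlier u_i
            basis.append(w / np.linalg.norm(w))
        return np.array(basis)

    U = gram_schmidt([(1, 1, 1), (0, 1, 1), (0, 0, 1)])
    print(np.round(U, 4))
    # rows ~ (1/sqrt 3, 1/sqrt 3, 1/sqrt 3), (-2/sqrt 6, 1/sqrt 6, 1/sqrt 6), (0, -1/sqrt 2, 1/sqrt 2)
    print(np.allclose(U @ U.T, np.eye(3)))      # the rows are orthonormal -> True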



LINEAR FUNCTIONALS AND ADJOINT OPERATORS 

Let V be an inner product space. Each u ∈ V determines a mapping û : V → K defined by

    û(v) = ⟨v, u⟩

Now for any a, b ∈ K and any v1, v2 ∈ V,

    û(av1 + bv2) = ⟨av1 + bv2, u⟩ = a⟨v1, u⟩ + b⟨v2, u⟩ = a û(v1) + b û(v2)

That is, û is a linear functional on V. The converse is also true for spaces of finite dimen-
sion and is an important theorem. Namely,

Theorem 13.5: Let φ be a linear functional on a finite dimensional inner product space V.
Then there exists a unique vector u ∈ V such that φ(v) = ⟨v, u⟩ for every
v ∈ V.

We remark that the above theorem is not valid for spaces of infinite dimension (Problem 
13.45), although some general results in this direction are known. (One such famous result 
is the Riesz representation theorem.) 

We use the above theorem to prove 

Theorem 13.6: Let T be a linear operator on a finite dimensional inner product space V.
Then there exists a unique linear operator T* on V such that

    ⟨T(u), v⟩ = ⟨u, T*(v)⟩

for every u, v ∈ V. Moreover, if A is the matrix of T relative to an
orthonormal basis {ei} of V, then the conjugate transpose A* of A is the
matrix of T* in the basis {ei}.

We emphasize that no such simple relationship exists between the matrices representing 
T and T* if the basis is not orthonormal. Thus we see one useful property of orthonormal 
bases. 

Definition: A linear operator T on an inner product space V is said to have an adjoint
operator T* on V if ⟨T(u), v⟩ = ⟨u, T*(v)⟩ for every u, v ∈ V.

Thus Theorem 13.6 states that every operator T has an adjoint if V has finite dimension. 
This theorem is not valid if V has infinite dimension (Problem 13.78). 

Example 13.13: Let T be the linear operator on C³ defined by

    T(x, y, z) = (2x + iy, y − 5iz, x + (1 − i)y + 3z)

We find a similar formula for the adjoint T* of T. Note (Problem 7.3) that the
matrix of T in the usual basis of C³ is

    [T] = ( 2    i      0  )
          ( 0    1     −5i )
          ( 1    1−i    3  )

Recall that the usual basis is orthonormal. Thus by Theorem 13.6, the matrix of T*
in this basis is the conjugate transpose of [T]:

    [T*] = ( 2    0     1   )
           ( −i   1     1+i )
           ( 0    5i    3   )

Accordingly,

    T*(x, y, z) = (2x + z, −ix + y + (1 + i)z, 5iy + 3z)
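
Theorem 13.6 and the example above can be checked numerically. The sketch below is an added illustration (NumPy and the random test vectors are assumptions): it forms [T], takes its conjugate transpose, and verifies ⟨T(u), v⟩ = ⟨u, T*(v)⟩.

    import numpy as np

    # Matrix of T from Example 13.13; its rows hold the coefficients of each component of T(x, y, z)
    T = np.array([[2, 1j, 0],
                  [0, 1, -5j],
                  [1, 1-1j, 3]])
    T_adj = T.conj().T                # matrix of T* in the same orthonormal basis

    def inner(x, y):
        return np.vdot(y, x)          # <x, y> = sum x_i * conj(y_i)

    rng = np.random.default_rng(1)
    u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    print(np.allclose(inner(T @ u, v), inner(u, T_adj @ v)))   # True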

The following theorem summarizes some of the properties of the adjoint. 
Theorem 13.7: Let S and T be linear operators on V and let k ∈ K. Then:

    (i) (S + T)* = S* + T*      (iii) (ST)* = T*S*

    (ii) (kT)* = k̄T*            (iv) (T*)* = T







ANALOGY BETWEEN A(V) AND C, SPECIAL OPERATORS

Let A(V) denote the algebra of all linear operators on a finite dimensional inner product
space V. The adjoint mapping T ↦ T* on A(V) is quite analogous to the conjugation map-
ping z ↦ z̄ on the complex field C. To illustrate this analogy we identify in the following
table certain classes of operators T ∈ A(V) whose behavior under the adjoint map imitates
the behavior under conjugation of familiar classes of complex numbers.



Class of complex numbers     Behavior under conjugation   Class of operators in A(V)                                  Behavior under the adjoint map

Unit circle (|z| = 1)        z̄ = 1/z                      Orthogonal operators (real case),                           T* = T⁻¹
                                                          unitary operators (complex case)

Real axis                    z̄ = z                        Self-adjoint operators, also called                         T* = T
                                                          symmetric (real case), Hermitian (complex case)

Imaginary axis               z̄ = −z                       Skew-adjoint operators, also called                         T* = −T
                                                          skew-symmetric (real case), skew-Hermitian (complex case)

Positive half axis (0, ∞)    z = ww̄, w ≠ 0                Positive definite operators                                 T = S*S
                                                                                                                      with S nonsingular



The analogy between these classes of operators T and complex numbers z is reflected in 
the following theorem. 

Theorem 13.8: Let λ be an eigenvalue of a linear operator T on V.
(i) If T* = T⁻¹, then |λ| = 1.
(ii) If T* = T, then λ is real.
(iii) If T* = −T, then λ is pure imaginary.
(iv) If T = S*S with S nonsingular, then λ is real and positive.

We now prove the above theorem. In each case let v be a nonzero eigenvector of T
belonging to λ, that is, T(v) = λv with v ≠ 0; hence ⟨v, v⟩ is positive.

Proof of (i): We show that λλ̄⟨v, v⟩ = ⟨v, v⟩:

    λλ̄⟨v, v⟩ = ⟨λv, λv⟩ = ⟨T(v), T(v)⟩ = ⟨v, T*T(v)⟩ = ⟨v, I(v)⟩ = ⟨v, v⟩

But ⟨v, v⟩ ≠ 0; hence λλ̄ = 1 and so |λ| = 1.

Proof of (ii): We show that λ⟨v, v⟩ = λ̄⟨v, v⟩:

    λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨v, T*(v)⟩ = ⟨v, T(v)⟩ = ⟨v, λv⟩ = λ̄⟨v, v⟩

But ⟨v, v⟩ ≠ 0; hence λ = λ̄ and so λ is real.

Proof of (iii): We show that λ⟨v, v⟩ = −λ̄⟨v, v⟩:

    λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨v, T*(v)⟩ = ⟨v, −T(v)⟩ = ⟨v, −λv⟩ = −λ̄⟨v, v⟩

But ⟨v, v⟩ ≠ 0; hence λ = −λ̄, and so λ is pure imaginary.




Proof of (iv): Note first that S(v) ≠ 0 because S is nonsingular; hence ⟨S(v), S(v)⟩ is
positive. We show that λ⟨v, v⟩ = ⟨S(v), S(v)⟩:

    λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨S*S(v), v⟩ = ⟨S(v), S(v)⟩

But ⟨v, v⟩ and ⟨S(v), S(v)⟩ are positive; hence λ is positive.

We remark that all the above operators T commute with their adjoint, that is, 
TT* = T*T. Such operators are called normal operators. 
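
These eigenvalue facts are easy to observe numerically. The sketch below is an editorial addition (the matrices are arbitrary illustrations built with NumPy): it produces a Hermitian, a skew-Hermitian, a unitary, and a positive definite matrix and prints their eigenvalues, which land on the real axis, the imaginary axis, the unit circle, and the positive half axis respectively.

    import numpy as np

    rng = np.random.default_rng(2)
    M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

    H = M + M.conj().T                    # Hermitian:        H* = H
    K = M - M.conj().T                    # skew-Hermitian:   K* = -K
    Q, _ = np.linalg.qr(M)                # unitary:          Q* = Q^{-1}
    P = M.conj().T @ M + np.eye(3)        # positive definite (equal to S*S for a suitable nonsingular S)

    print(np.linalg.eigvals(H))           # real (imaginary parts ~ 0)
    print(np.linalg.eigvals(K))           # pure imaginary (real parts ~ 0)
    print(np.abs(np.linalg.eigvals(Q)))   # all ~ 1
    print(np.linalg.eigvals(P))           # real and positive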

ORTHOGONAL AND UNITARY OPERATORS 

Let U be a linear operator on a finite dimensional inner product space V. As defined
above, if

    U* = U⁻¹ or equivalently UU* = U*U = I

then U is said to be orthogonal or unitary according as the underlying field is real or com-
plex. The next theorem gives alternate characterizations of these operators.

Theorem 13.9: The following conditions on an operator U are equivalent:
(i) U* = U⁻¹, that is, UU* = U*U = I.
(ii) U preserves inner products, i.e. for every v, w ∈ V, ⟨U(v), U(w)⟩ = ⟨v, w⟩.
(iii) U preserves lengths, i.e. for every v ∈ V, ||U(v)|| = ||v||.

Example 13.14: Let T : R³ → R³ be the linear operator which rotates each vector about the z axis
by a fixed angle θ:

    T(x, y, z) = (x cos θ − y sin θ, x sin θ + y cos θ, z)

Observe that lengths (distances from the origin) are preserved under T. Thus T is
an orthogonal operator.

Example 13.15: Let V be the ℓ2-space of Example 13.5. Let T : V → V be the linear operator de-
fined by T(a1, a2, ...) = (0, a1, a2, ...). Clearly, T preserves inner products and
lengths. However, T is not surjective since, for example, (1, 0, 0, ...) does not belong
to the image of T; hence T is not invertible. Thus we see that Theorem 13.9 is not
valid for spaces of infinite dimension.

An isomorphism from one inner product space into another is a bijective mapping
which preserves the three basic operations of an inner product space: vector addition,
scalar multiplication, and inner products. Thus the above mappings (orthogonal and
unitary) may also be characterized as the isomorphisms of V into itself. Note that such a
mapping U also preserves distances, since

    ||U(v) − U(w)|| = ||U(v − w)|| = ||v − w||

and so U is also called an isometry.

ORTHOGONAL AND UNITARY MATRICES 

Let U be a linear operator on an inner product space V. By Theorem 13.6 we obtain the
following result when the base field K is complex.

Theorem 13.10A: A matrix A with complex entries represents a unitary operator U
(relative to an orthonormal basis) if and only if A* = A⁻¹.

On the other hand, if the base field K is real then A* = Aᵗ; hence we have the follow-
ing corresponding theorem for real inner product spaces.





Theorem 13.10B: A matrix A with real entries represents an orthogonal operator U
(relative to an orthonormal basis) if and only if Aᵗ = A⁻¹.

The above theorems motivate the following definitions.

Definition: A complex matrix A for which A* = A⁻¹, or equivalently AA* = A*A = I,
is called a unitary matrix.

Definition: A real matrix A for which Aᵗ = A⁻¹, or equivalently AAᵗ = AᵗA = I, is
called an orthogonal matrix.

Observe that a unitary matrix with real entries is orthogonal.

Example 13.16: Suppose A is a unitary 2 by 2 matrix with rows (a1, a2) and (b1, b2). Then AA* = I
and hence

    AA* = ( a1  a2 ) ( ā1  b̄1 )  =  ( |a1|² + |a2|²   a1b̄1 + a2b̄2 )  =  ( 1  0 )
          ( b1  b2 ) ( ā2  b̄2 )     ( b1ā1 + b2ā2    |b1|² + |b2|² )     ( 0  1 )

Thus

    |a1|² + |a2|² = 1,  |b1|² + |b2|² = 1  and  a1b̄1 + a2b̄2 = 0

Accordingly, the rows of A form an orthonormal set. Similarly, A*A = I forces
the columns of A to form an orthonormal set.

The result in the above example holds true in general; namely, 

Theorem 13.11: The following conditions for a matrix A are equivalent: 
(i) A is unitary (orthogonal). 
(ii) The rows of A form an orthonormal set. 
(iii) The columns of A form an orthonormal set. 

Example 13.17: The matrix A representing the rotation T in Example 13.14 relative to the usual
basis of R³ is

    A = ( cos θ   −sin θ   0 )
        ( sin θ    cos θ   0 )
        ( 0        0       1 )

As expected, the rows and the columns of A each form an orthonormal set; that is,
A is an orthogonal matrix.
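
A quick numerical check of Theorem 13.11 for this rotation matrix (an added sketch; the angle θ = 0.7 is chosen arbitrarily):

    import numpy as np

    theta = 0.7
    A = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])

    print(np.allclose(A @ A.T, np.eye(3)))   # rows orthonormal    -> True
    print(np.allclose(A.T @ A, np.eye(3)))   # columns orthonormal -> True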



CHANGE OF ORTHONORMAL BASIS 

In view of the special role of orthonormal bases in the theory of inner product spaces, 
we are naturally interested in the properties of the transition matrix from one such basis 
to another. The following theorem applies. 

Theorem 13.12: Let {e1, ..., en} be an orthonormal basis of an inner product space V.
Then the transition matrix from {ei} into another orthonormal basis is
unitary (orthogonal). Conversely, if P = (a_ij) is a unitary (orthogonal)
matrix, then the following is an orthonormal basis:

    {e'_i = a_{1i}e1 + a_{2i}e2 + ··· + a_{ni}en : i = 1, ..., n}

Recall that matrices A and B representing the same linear operator T are similar, i.e.
B = P⁻¹AP where P is the (nonsingular) transition matrix. On the other hand, if V is
an inner product space, we are usually interested in the case when P is unitary (or orthog-
onal) as suggested by the above theorem. (Recall that P is unitary if P* = P⁻¹, and P is
orthogonal if Pᵗ = P⁻¹.) This leads to the following definition.




Definition: Complex matrices A and B are unitarily equivalent if there is a unitary matrix
P for which B = P*AP. Analogously, real matrices A and B are orthogonally
equivalent if there is an orthogonal matrix P for which B = PᵗAP.

Observe that orthogonally equivalent matrices are necessarily congruent (see page 262).

POSITIVE OPERATORS 

Let P be a linear operator on an inner product space V. P is said to be positive (or:
semi-definite) if

    P = S*S for some operator S

and is said to be positive definite if S is also nonsingular. The next theorems give alternate
characterizations of these operators.

Theorem 13.13A: The following conditions on an operator P are equivalent:
(i) P = T² for some self-adjoint operator T.
(ii) P = S*S for some operator S.
(iii) P is self-adjoint and ⟨P(u), u⟩ ≥ 0 for every u ∈ V.

The corresponding theorem for positive definite operators is

Theorem 13.13B: The following conditions on an operator P are equivalent:
(i) P = T² for some nonsingular self-adjoint operator T.
(ii) P = S*S for some nonsingular operator S.
(iii) P is self-adjoint and ⟨P(u), u⟩ > 0 for every u ≠ 0 in V.

DIAGONALIZATION AND CANONICAL FORMS IN EUCLIDEAN SPACES 

Let T be a linear operator on a finite dimensional inner product space V over K. Rep-
resenting T by a diagonal matrix depends upon the eigenvectors and eigenvalues of T,
and hence upon the roots of the characteristic polynomial Δ(t) of T (Theorem 9.6). Now
Δ(t) always factors into linear polynomials over the complex field C, but may not have any
linear polynomials over the real field R. Thus the situation for Euclidean spaces (where
K = R) is inherently different from that for unitary spaces (where K = C); hence we treat
them separately. We investigate Euclidean spaces below, and unitary spaces in the next
section.

Theorem 13.14: Let T be a symmetric (self-adjoint) operator on a real finite dimensional
inner product space V. Then there exists an orthonormal basis of V
consisting of eigenvectors of T; that is, T can be represented by a
diagonal matrix relative to an orthonormal basis.

We give the corresponding statement for matrices.

Alternate Form of Theorem 13.14: Let A be a real symmetric matrix. Then there exists
an orthogonal matrix P such that B = P⁻¹AP = PᵗAP
is diagonal.

We can choose the columns of the above matrix P to be normalized orthogonal eigen-
vectors of A; then the diagonal entries of B are the corresponding eigenvalues.




Example 13.18: Let A = ( 2  −2 ; −2  5 ). We find an orthogonal matrix P such that PᵗAP is diagonal.

The characteristic polynomial Δ(t) of A is

    Δ(t) = |tI − A| = | t−2   2 ; 2   t−5 | = t² − 7t + 6 = (t − 6)(t − 1)

The eigenvalues of A are 6 and 1. Substitute t = 6 into the matrix tI − A to
obtain the corresponding homogeneous system of linear equations

    4x + 2y = 0,  2x + y = 0

A nonzero solution is v1 = (1, −2). Next substitute t = 1 into the matrix tI − A
to find the corresponding homogeneous system

    −x + 2y = 0,  2x − 4y = 0

A nonzero solution is v2 = (2, 1). As expected by Problem 13.31, v1 and v2 are orthogonal.
Normalize v1 and v2 to obtain the orthonormal basis

    {u1 = (1/√5, −2/√5), u2 = (2/√5, 1/√5)}

Finally let P be the matrix whose columns are u1 and u2 respectively. Then

    P = (  1/√5   2/√5 )        and   P⁻¹AP = PᵗAP = ( 6  0 )
        ( −2/√5   1/√5 )                             ( 0  1 )

As expected, the diagonal entries of PᵗAP are the eigenvalues corresponding to the
columns of P.
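
NumPy's eigh routine carries out exactly this diagonalization for a real symmetric matrix, returning the eigenvalues and an orthogonal matrix of normalized eigenvectors. The sketch below is an added illustration; it reuses the matrix of Example 13.18.

    import numpy as np

    A = np.array([[2.0, -2.0],
                  [-2.0, 5.0]])

    eigvals, P = np.linalg.eigh(A)   # columns of P are orthonormal eigenvectors
    print(eigvals)                   # [1. 6.]  (eigh returns the eigenvalues in ascending order)
    print(np.allclose(P.T @ A @ P, np.diag(eigvals)))   # P^t A P is diagonal -> True
    print(np.allclose(P.T @ P, np.eye(2)))              # P is orthogonal     -> True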

We observe that the matrix B = P⁻¹AP = PᵗAP is also congruent to A. Now if q is
a real quadratic form represented by the matrix A, then the above method can be used
to diagonalize q under an orthogonal change of coordinates. This is illustrated in the
next example.

Example 13.19: Find an orthogonal transformation of coordinates which diagonalizes the quadratic
form q(x, y) = 2x² − 4xy + 5y².

The symmetric matrix representing q is A = ( 2  −2 ; −2  5 ). In the preceding
example we obtained the orthogonal matrix

    P = (  1/√5   2/√5 )        for which   PᵗAP = ( 6  0 )
        ( −2/√5   1/√5 )                           ( 0  1 )

(Here 6 and 1 are the eigenvalues of A.) Thus the required orthogonal transforma-
tion of coordinates is

    ( x ) = P ( x' ),   that is,   x = x'/√5 + 2y'/√5,   y = −2x'/√5 + y'/√5
    ( y )     ( y' )

Under this change of coordinates q is transformed into the diagonal form

    q(x', y') = 6x'² + y'²

Note that the diagonal entries of q are the eigenvalues of A.
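
The change of coordinates can be verified by substitution: putting x = x'/√5 + 2y'/√5 and y = −2x'/√5 + y'/√5 into q should give 6x'² + y'². A short numerical spot check (an editorial addition; the test points are arbitrary):

    import numpy as np

    def q(x, y):
        return 2*x**2 - 4*x*y + 5*y**2

    s5 = np.sqrt(5.0)
    for xp, yp in [(1.0, 0.0), (0.0, 1.0), (2.0, -3.0)]:
        x = xp/s5 + 2*yp/s5
        y = -2*xp/s5 + yp/s5
        print(np.isclose(q(x, y), 6*xp**2 + yp**2))   # True at each point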

An orthogonal operator T need not be symmetric, and so it may not be represented by 
a diagonal matrix relative to an orthonormal basis. However, such an operator T does have 
a simple canonical representation, as described in the next theorem. 

Theorem 13.15: Let T be an orthogonal operator on a real inner product space V. Then
there is an orthonormal basis with respect to which T is represented by a
block diagonal matrix whose diagonal blocks are 1 by 1 blocks (1), 1 by 1
blocks (−1), and 2 by 2 rotation blocks

    ( cos θi   −sin θi )
    ( sin θi    cos θi ),    i = 1, ..., r

The reader may recognize the above 2 by 2 diagonal blocks as representing rotations in
the corresponding two-dimensional subspaces.



DIAGONALIZATION AND CANONICAL FORMS IN UNITARY SPACES 

We now present the fundamental diagonalization theorem for complex inner product 
spaces, i.e. for unitary spaces. Recall that an operator T is said to be normal if it com- 
mutes with its adjoint, i.e. if TT* = T* T. Analogously, a complex matrix A is said to be 
normal if it commutes with its conjugate transpose, i.e. if AA* = A*A. 



Example 13.20: Let A = ( 1  1 ; i  3+2i ). Then

    AA* = ( 1  1    ) ( 1   −i   )  =  ( 2      3−3i )
          ( i  3+2i ) ( 1   3−2i )     ( 3+3i   14   )

    A*A = ( 1   −i   ) ( 1  1    )  =  ( 2      3−3i )
          ( 1   3−2i ) ( i  3+2i )     ( 3+3i   14   )

Thus AA* = A*A and A is a normal matrix.
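
The computation in Example 13.20 takes one line with NumPy; this sketch is an editorial addition and uses the matrix as reconstructed above.

    import numpy as np

    A = np.array([[1, 1],
                  [1j, 3+2j]])
    print(np.allclose(A @ A.conj().T, A.conj().T @ A))   # True: A is normal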

The following theorem applies. 

Theorem 13.16: Let T be a normal operator on a complex finite dimensional inner product 
space V. Then there exists an orthonormal basis of V consisting of 
eigenvectors of T; that is, T can be represented by a diagonal matrix 
relative to an orthonormal basis. 

We give the corresponding statement for matrices. 

Alternate Form of Theorem 13.16: Let A be a normal matrix. Then there exists a uni-
tary matrix P such that B = P⁻¹AP = P*AP is diagonal.

The next theorem shows that even non-normal operators on unitary spaces have a 
relatively simple form. 

Theorem 13.17: Let T be an arbitrary operator on a complex finite dimensional inner 
product space V. Then T can be represented by a triangular matrix 
relative to an orthonormal basis of V. 




Alternate Form of Theorem 13.17: Let A be an arbitrary complex matrix. Then there
exists a unitary matrix P such that B = P⁻¹AP = P*AP is triangular.

SPECTRAL THEOREM 

The Spectral Theorem is a reformulation of the diagonalization Theorems 13.14 and 13.16. 

Theorem 13.18 (Spectral Theorem): Let T be a normal (symmetric) operator on a com-
plex (real) finite dimensional inner product space V. Then there exist
orthogonal projections E1, ..., Er on V and scalars λ1, ..., λr such that

    (i) T = λ1E1 + λ2E2 + ··· + λrEr

    (ii) E1 + E2 + ··· + Er = I

    (iii) EiEj = 0 for i ≠ j.

The next example shows the relationship between a diagonal matrix representation and 
the corresponding orthogonal projections. 

Example 13.21: Consider a diagonal matrix A whose distinct diagonal entries are 2, 3 and 5, and for
each distinct entry let Ei be the diagonal matrix with 1 in the positions where A has
that entry and 0 elsewhere. The reader can verify that the Ei are projections, i.e.
Ei² = Ei, and that

    (i) A = 2E1 + 3E2 + 5E3,  (ii) E1 + E2 + E3 = I,  (iii) EiEj = 0 for i ≠ j
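
The spectral decomposition can be computed directly from an orthonormal eigenbasis: each Ei is the orthogonal projection onto the eigenspace of λi. The minimal sketch below is an editorial addition; it uses the symmetric matrix of Example 13.18 and builds the projections from the eigenvectors returned by eigh.

    import numpy as np

    A = np.array([[2.0, -2.0],
                  [-2.0, 5.0]])
    eigvals, P = np.linalg.eigh(A)

    # For each distinct eigenvalue, E_i = sum of u u^t over its unit eigenvectors u
    projections = []
    for lam in np.unique(np.round(eigvals, 10)):
        cols = P[:, np.isclose(eigvals, lam)]
        projections.append((lam, cols @ cols.T))

    print(np.allclose(sum(lam * E for lam, E in projections), A))        # (i)  A = sum lambda_i E_i
    print(np.allclose(sum(E for _, E in projections), np.eye(2)))        # (ii) sum E_i = I
    (_, E1), (_, E2) = projections
    print(np.allclose(E1 @ E2, 0))                                       # (iii) E_i E_j = 0 for i != j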



Solved Problems 

INNER PRODUCTS 

13.1. Verify the relation ⟨u, av1 + bv2⟩ = ā⟨u, v1⟩ + b̄⟨u, v2⟩.

Using [I2], then [I1], and then [I2] again, we find

    ⟨u, av1 + bv2⟩ = \overline{⟨av1 + bv2, u⟩} = \overline{a⟨v1, u⟩ + b⟨v2, u⟩}
                   = ā \overline{⟨v1, u⟩} + b̄ \overline{⟨v2, u⟩} = ā⟨u, v1⟩ + b̄⟨u, v2⟩

13.2. Verify that the following is an inner product in R²:

    ⟨u, v⟩ = x1y1 − x1y2 − x2y1 + 3x2y2, where u = (x1, x2), v = (y1, y2).

Method 1.

We verify the three axioms of an inner product. Letting w = (z1, z2), we find

    au + bw = a(x1, x2) + b(z1, z2) = (ax1 + bz1, ax2 + bz2)

Thus

    ⟨au + bw, v⟩ = ⟨(ax1 + bz1, ax2 + bz2), (y1, y2)⟩
                 = (ax1 + bz1)y1 − (ax1 + bz1)y2 − (ax2 + bz2)y1 + 3(ax2 + bz2)y2
                 = a(x1y1 − x1y2 − x2y1 + 3x2y2) + b(z1y1 − z1y2 − z2y1 + 3z2y2)
                 = a⟨u, v⟩ + b⟨w, v⟩

and so axiom [I1] is satisfied. Also,

    ⟨v, u⟩ = y1x1 − y1x2 − y2x1 + 3y2x2 = x1y1 − x1y2 − x2y1 + 3x2y2 = ⟨u, v⟩

and axiom [I2] is satisfied. Finally,

    ⟨u, u⟩ = x1² − 2x1x2 + 3x2² = x1² − 2x1x2 + x2² + 2x2² = (x1 − x2)² + 2x2² ≥ 0

Also, ⟨u, u⟩ = 0 if and only if x1 = 0, x2 = 0, i.e. u = 0. Hence the last axiom [I3] is satisfied.

Method 2.

We argue via matrices. That is, we can write ⟨u, v⟩ in matrix notation:

    ⟨u, v⟩ = uᵗAv = (x1, x2) (  1  −1 ) ( y1 )
                             ( −1   3 ) ( y2 )

Since uᵗAv is linear in u, [I1] holds. Since A is symmetric, [I2] holds. Thus we need only show that
A is positive definite. Applying the elementary row operation R2 → R1 + R2 and then the corresponding
elementary column operation C2 → C1 + C2, we transform A into the diagonal form ( 1  0 ; 0  2 ). Thus A
is positive definite and [I3] holds.



13.3. Find the norm of v = (3, 4) ∈ R² with respect to:

(i) the usual inner product, (ii) the inner product in Problem 13.2.

(i) ||v||² = ⟨v, v⟩ = ⟨(3, 4), (3, 4)⟩ = 9 + 16 = 25; hence ||v|| = 5.

(ii) ||v||² = ⟨v, v⟩ = ⟨(3, 4), (3, 4)⟩ = 9 − 12 − 12 + 48 = 33; hence ||v|| = √33.

13.4. Normalize each of the following vectors in Euclidean space R³:

(i) u = (2, 1, −1),  (ii) v = (1/2, 2/3, −1/4).

(i) Note ⟨u, u⟩ is the sum of the squares of the entries of u; that is, ⟨u, u⟩ = 2² + 1² + (−1)² = 6.
Hence divide u by ||u|| = √⟨u, u⟩ = √6 to obtain the required unit vector:

    u/||u|| = (2/√6, 1/√6, −1/√6)

(ii) First multiply v by 12 to "clear" of fractions: 12v = (6, 8, −3). We have ⟨12v, 12v⟩ =
6² + 8² + (−3)² = 109. Then the required unit vector is

    12v/||12v|| = (6/√109, 8/√109, −3/√109)

13.5. Let V be the vector space of polynomials with inner product given by ⟨f, g⟩ =
∫_0^1 f(t) g(t) dt. Let f(t) = t + 2 and g(t) = t² − 2t − 3. Find (i) ⟨f, g⟩ and (ii) ||f||.

(i) ⟨f, g⟩ = ∫_0^1 (t + 2)(t² − 2t − 3) dt = [ t⁴/4 − 7t²/2 − 6t ]_0^1 = −37/4

(ii) ⟨f, f⟩ = ∫_0^1 (t + 2)(t + 2) dt = 19/3 and ||f|| = √⟨f, f⟩ = √(19/3)




13.6. Prove Theorem 13.1 (Cauchy-Schwarz): |⟨u, v⟩| ≤ ||u|| ||v||.

If v = 0, the inequality reduces to 0 ≤ 0 and hence is valid. Now suppose v ≠ 0. Using
zz̄ = |z|² (for any complex number z) and ⟨v, u⟩ = \overline{⟨u, v⟩}, we expand ||u − ⟨u, v⟩tv||² ≥ 0, where t
is any real value:

    0 ≤ ||u − ⟨u, v⟩tv||² = ⟨u − ⟨u, v⟩tv, u − ⟨u, v⟩tv⟩
      = ⟨u, u⟩ − \overline{⟨u, v⟩}t⟨u, v⟩ − ⟨u, v⟩t⟨v, u⟩ + ⟨u, v⟩\overline{⟨u, v⟩}t²⟨v, v⟩
      = ||u||² − 2t|⟨u, v⟩|² + |⟨u, v⟩|²t²||v||²

Set t = 1/||v||² to find 0 ≤ ||u||² − |⟨u, v⟩|²/||v||², from which |⟨u, v⟩|² ≤ ||u||² ||v||². Taking the square
root of both sides, we obtain the required inequality.

13.7. Prove that the norm in an inner product space satisfies the following axioms:

[N1]: ||v|| ≥ 0; and ||v|| = 0 if and only if v = 0.

[N2]: ||kv|| = |k| ||v||.

[N3]: ||u + v|| ≤ ||u|| + ||v||.

By [I3], ⟨v, v⟩ ≥ 0; hence ||v|| = √⟨v, v⟩ ≥ 0. Furthermore, ||v|| = 0 if and only if ⟨v, v⟩ = 0,
and this holds if and only if v = 0. Thus [N1] is valid.

We find ||kv||² = ⟨kv, kv⟩ = kk̄⟨v, v⟩ = |k|² ||v||². Taking the square root of both sides gives [N2].

Using the Cauchy-Schwarz inequality, we obtain

    ||u + v||² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩
               ≤ ||u||² + 2||u|| ||v|| + ||v||² = (||u|| + ||v||)²

Taking the square root of both sides yields [N3].

Remark: [N3] is frequently called the triangle inequality because, if we view u + v as the side of the
triangle formed with u and v, then [N3] states that the length of one side of a triangle is less than or
equal to the sum of the lengths of the other two sides.

ORTHOGONALITY 

13.8. Show that if u is orthogonal to v, then every scalar multiple of u is also orthogonal 
to V. Find a unit vector orthogonal to Vi = (1, 1, 2) and V2 = (0, 1, 3) in R^. 

If {u, V) = then {ku, v) = k{u, v) = fc • = 0, as required. Let w = {x, y, z). We want 

= {w,v{) = X + y + 2z and = (w, V2) = y + 3z 
Thus we obtain the homogeneous system 

X + y + 2z = 0, y + Sz = 
Set z = 1 to find y = —S and x = 1; then w = (1,-3,1). Normalize w to obtain the required 
unit vector w' orthogonal to v^ and ^2- W = w/||w|| = (l/yfTl, — 3/-\/ll, l/-v/li). 

13.9. Let W be the subspace of R^ spanned by u = (1, 2, 3, -1, 2) and v = (2, 4, 7, 2, -1). 
Find a basis of the orthogonal complement W^ of W. 

We seek all vectors w = (x, y, z, s, t) such that 

{w,u) = X + 2y + Sz - s + 2t = 
(w,v) = 2x + 4:y + 7z + 2s - t = 
Eliminating x from the second equation, we find the equivalent system 

X + 2y + 3z - s + 2t = 
z + 4s ~ 5t = 

The free variables are y, s and t. Set y = —1, s = 0, t = to obtain the solution Wi = (2, —1, 0, 0, 0). 
Set y = 0, s-1, t = to find the solution Wj = (13,0,-4,1,0). Set y-Q, 8 = 0, t = l to obtain 
the solution W3 = (—17, 0, 5, 0, 1). The set {wj, W2. Ws) is a basis of W . 





13.10. Find an orthonormal basis of the subspace W of C^ spanned by Vi — (1, i, 0) and 
t;2 = (l,2,l-i). 

Apply the Gram-Schmidt orthogonalization process. First normalize Vj. We find 

\M? = (.'"k'^i) = I'l + i'(-i) + 0-0 = 2 and so ||vi|| = V2 

Thus Ml = vi/\\vi\\ = (l/V2,i/y/2,Q). 

To form W2 = Vj — (v^, Wi>Mi, first compute 

<i'2,wi> = <(1,2, l-«), (l/\/2,t/\/2, 0)> = l/V2-2i/V2 = (l-2i)/V2 

Then ., = (i,2.1_^)--^^__,0j - ^_^,^,l-^ 

Next normalize W2 or, equivalently, 2w2 = (1 + 2i, 2 — i, 2 — 2i). We have 

||2m)i||2 = (2m;i,2wi) = (1 + 2t)(l - 2i) + (2 - 1)(2 + 1) + (2 - 2i)(2 + 2i) = 18 
and ||2wi|l = \/l8. Thus the required orthonormal basis of W is 

/ 1 i \ 2wi /l + 2i 2-t 2 -2i 



^2 yT^ r ' ii2»iii vig'^'vn 



13.11. Prove Lemma 13.3: An orthonormal set {t*i, ...,«,} is linearly independent and, for 
any v e V, the vector 

yj = V — {V, Ui)Ui — {V, U^Ut — • ■ ■ — {V, Ur)Ur 

is orthogonal to each of the im. 

Suppose aiMi + • • • + a^u^ = 0. Taking the inner product of both sides with respect to Mi, 

= <0, Mi> = (aiMi + • ■ • + a^u^, Ml) 

= ai(Mi,Mi> + a2<M2,Mi> + ■•• + 0^<M„Mi> 

= Oi • 1 + a2 • + • • ■ + ftr ' = O-i 
or Oi = 0. Similarly, for i = 2, . . . , r, 

= (0, Mj) = <aiMi + • • • + a^M„ Mj) 

= ai<Mi, Mj> + • ■ • + Oi<Mi, Mj) + • • • + ar(Mr. «{> = "i 

Accordingly, {mi, . . . , Uj) is linearly independent. 

It remains to show that w is orthogonal to each of the Mj. Taking the inner product of w with 
respect to ttj, 

<«;,Ml> = <V,Ml> - <'y,Mi><Mi,Mi> - <t', M2KM2, Mi> - •■• - (V,M^>(M„Mi> 
= {V, Ml) - (V, Ml) • 1 - <V, M2) • - • • • - <V, M^> • = 

That is, w is orthogonal to Mi. Similarly, for i = 2, . . . ,r, 

<W,Mi) = <'U,Mi) - <'U,Ml)<Mi,Mi) - • • • - (■U,Mi)(Mi,Mj) - • • • - (•«, M^XMr, Mj) = 

Thus w is orthogonal to Mj for i = 1 r, as claimed. 

13.12. Let TF be a subspace of an inner product space V. Show that there is an orthonormal 
basis of W which is part of an orthonormal basis of V. 

We choose a basis {v^, ...yV^ioiW and extend it to a basis {vj, ...,v^} of V. We then apply 
the Gram-Schmidt orthogonalization process to {vi,...,v^} to obtain an orthonormal basis 
{Ml M„} of y where, for i = 1, . . . , w, M; = ai^v^ + •■■ + au^i. Thus 'u^,...,Ur^W and there- 
fore {mi, . . . , mJ is an orthonormal basis of W. 




13.13. Prove Theorem 13.2: Let W' be a subspace of V; then V=W@W-^. 

By Problem 13.12 there exists an orthonormal basis {ui, . . ., u^} of W which is part of an ortho- 
normal basis {mi, . . .,m„} of V. Since {mj, . . .,«„} is orthonormal, u^+i, ...,«„£ TF"*-. If v e.V, 

V = OjMj + • • ■ + ft„M„ where ajV-i + ■ ■ • + a^u^ G W, (ir+i«*r + i + ' " • + «««« ^ ^ 

Accordingly, y = W + W"*-. 

On the other hand, if wGWnW-^, then <w,w> = 0. This yields w = 0; hence WnW''- = {0}. 

The two conditions, V =W+W^ and PTn W'"'- = {0}, give the desired result V =W ®W^. 

Note that we have proved the theorem only for the case that V has finite dimension; we remark 
that the theorem also holds for spaces of arbitrary dimension. 



13.14. Let W' be a subspace of W. Show that WcW^^-^, and that W = W^-^ when V 
has finite dimension. 

Let weW. Then {w,v) = for every vGW^; hence wSTV^-^. Accordingly, WcW-^-^. 
Now suppose V has finite dimension. By Theorem 13.2, V — W ® W'^ and, also, V = 
W^ ®W-^-^ . Hence 

dim W = dim y - dim W*" and dim TF"^ "^ = dim y - dim W"^ 
This yields dim TF = dimiy-'--'-. But WcW'^-'- by the above; hence W = W-^-^, as required. 



13.15. Let {ei, ...,€„} be an orthonormal basis of V. Prove: 

(i) for any uGV, u = {u, ei)ei 4- (u, 62)62 +•••+(«, en>e„; 

(ii) (ttifii + • • • + OnBu, biCi + • • • + 6„e„) = aibi + chbl + • • • + Onhn; 



(iii) for any u,v GV, (u, v) = {u, ei){v, ei) + • • • + (u, e„)<v, e„>; 

(iv) if T-.V^V is linear, then (r(ej), ei) is the i/-entry of the matrix A representing 
T in the given basis {d}. 

(i) Suppose at = /f 1 «! + fcj 62 + • • ■ + fen^n- Taking the inner product of u with ej, 

= fci<ei, ei> + fc2<e2, ej) + • • • + fc„(e„, ej} 
= fcj • 1 + fcj • + • • • + fe„ • = fci 
Similarly, for i = 2, . . .,n, 

{u, gj) = (fciCi + h fcjej H h fc„e„, ej) 

= kiie^, ej> + • • • + fci(ei, 6;) + • • • + fc„<e„, gj) 
= fci • + ■ • • + fci • 1 + • • • + fc„ • = fej 
Substituting <m, ej) for fej in the equation m = fcie, + • • • + fc„e„, we obtain the desired result. 

(ii) We have ( 2 ajee, 2 6jeA = 2 ai6;<ei, e^} 

\i=l i=l / i.i = l 

But (ej, e^) = for i 7^ j, and (Cj, e^) = 1 for i = j; hence, as required. 



2 ajej, 2 fcjBj ) = 2 O'^i = aibi + ajbz + • • • + a„6„ 

i=l j = l / 1=1 

(iii) By (i), u = (m, e,>ei +•••+(«, e„>e„ and i^ = {v, ei>ei + • • • + (i), e„>e„ 



Then by (ii), {u, v) = (m, e^Xv, e,) + (u, CzKv, 62) + ■ ■ ■ + {u, e„){v, e„> 




(iv) By(i), 

r(e,) = <r(ei), ei>ei + {T(e,), e^je^ + ■■■ + {T{e,), e„>e„ 






T{e„) = iT(e„), e,)ei + {T(e„), 6^)6^ + • • • + (T{e^), e„)e„ 

The matrix A representing T in the basis {e;} is the transpose of the above matrix of 
efficients; hence the v-entry of A is (T(ej), ej. 

ADJOINTS 

13.16. Let T be the linear operator on C^ defined by 

T{x, y, z) = (2x + (1 - i)y, (3 + 2i)x - 4iz, 2ix + (4 - Zi)y - Zz) 

¥mdiT*{x,y,z). 

First find the matrix A representing T in the usual basis of C^ (see Problem 7.3): 

/ 2 X-i 

A = 3 + 2i -4i 
\ 2i 4 - 3i -3 

Form the conjugate transpose A* of A: 

/ 2 3 - 2i -2i 

A* = 1 + i 4 + 3t 

\ 4i -3 

Thus 

T*(x, y, z) = {2x + (3 - 2i)y - 2iz, (1 + i)x + (4 + 3i)z, Aiy - 3z) 

13.17. Prove Theorem 13.5: Let ^ be a linear functional on a finite dimensional inner 
product space V. Then there exists a unique % G F such that ^(v) = {v, u) for 
every v G.V. 

Let {ei, . . . , e„} be an orthonormal basis of V. Set 

u = 0(ei)ei + 0(62)62 + • • • + 0(e„)e„ 

Let M be the linear functional on V defined by u(v) = (v, u), for every v ElV. Then for i = 1, . . . , to, 

w(ei) = (ej.M) = ^ej, 0(e7)ei + • • • + 0(ije„> = <p(e,) 

Since m and agree on each basis vector, u = <i>. 

Now suppose m' is another vector in V for which <i,(v) = (v, u') for every vGV. Then 
(V, u) = (V, u') or (V, u-u') = 0. In particular this is true for v = u-u' and so (u ~u',u- u') = 0. 
This yields u — u' = and u = u'. Thus such a vector m is unique as claimed. 

13.18. Prove Theorem 13.6: Let T be a linear operator on a finite dimensional inner product 
space V. Then there exists a unique linear operator T* on V such that {T{u), v) = 
{u, T* (v)), for every u,v GV. Moreover, if A is the matrix representing T in an 
orthonormal basis {ei} of V, then the conjugate transpose A* of A is the matrix rep- 
resenting T* in {Ci}. 

We first define the mapping T*. Let v be an arbitrary but fixed element of V. The map 
u h» (T(u), V) is a linear functional on V. Hence by Theorem 13.5 there exists a unique element 
v'&V such that {T{u),v) = {u,v'} for every u&V. We define T*V-^V by T*(v) = v'. Then 
(T(u), v) = {u, T* (v)) for every u.v&V. 




We next show that T* is linear. For any u, V; G V, and any a,b G K, 

(u, T*(av^ + hv^)) = {T(u), av^ + bvz) = d{T(u), v^) + b{T{u), v^) 

= a(u, r*(vi)) + b{u, T*(V2)} = (u, aT*(Vi) + bT*{v2)) 

But this is true for every uGV; hence T* {av^ + bv^) = aT*(vi) + bT*(v2). Thus T* is linear. 
By Problem 13.15(iv), the matrices A = (ay) and B = (6y) representing T and T* respectively 
in the basis {ej are given by Oy = (r(ej), e;) and by = <r*(ej), e^). Hence 



6y = <r*(e,.), ej) = {ei, r*(e^)> = {T{ei), e,> = o^i 
Thus B = A*, as claimed. 

13.19. Prove Theorem 13.7: Let S and T be linear operators on a finite dimensional inner 
product space V and let k G K. Then: 

(i) (5 + 7)* = 5*4-2^* (iii) (ST)* = T*S* 

(ii) (/cT)* = kT* (iv) (T*)* = T 

(i) For any u,v G V, 

{{S + T){u), V) = (S(m) + T{u), V) = {S{u), V) + {T(u), v) = (u, S*{v)> + (u, T*{v)) 

= {u, S*{v) + T*(v)) = (u, (S* + T*){v)} 
The uniqueness of the adjoint implies (S + T)* = S* + T*. 

(ii) For any u,v G V, 

{(kT){u), V) = (kT{u), V) = k(T(u), v) = k{u, T*{v)) = (u, kT*iv)) = {u, (kT*)(v)) 

The uniqueness of the adjoint implies {kT)* = kT*. 

(iii) For any u,v G V, 

{(ST)(u),v) = {S(T{u)),v} = {T(u),S*{v)) ^ {u, T*(S*(v))) = (m, (r*S*)(i;)> 
The uniqueness of the adjoint implies (ST)* = T*S*. 



(iv) For any u,vGV, {T*(u),v) = {v, T*{u)) = {T(v),u) = {u, T(v)) 

The uniqueness of the adjoint implies (T*)* = T. 

13.20. Show that: (i) /* = 7; (ii) 0* = 0; (iii) if T is invertible, then (T-i)* = T*-\ 
(1) For every u,vGV, {I(u),v} = {u,v} = {u,I{v)}; hence I* = I. 

(ii) For every u,vGV, <0(M),'y) = {0,v) = = (u,0) - {u,0(v)); hence 0* = 0. 
(iii) 7 = /* = (TT-i)* = (r-i)*r*; hence {T'^)* = T*-K 

13.21. Let r be a linear operator on V, and let W he a T-invariant subspace of V. Show 
that W is invariant under T*. 

Let uGW^. If wGW, then T{w) G W and so (w, T*(u)) = (Tiw), u) - 0. Thus T*(ii) GW^ 
since it is orthogonal to every w GW . Hence W is invariant under T*. 

13.22. Let r be a linear operator on V. Show that each of the following conditions implies 
r = 0: 

(i) (r(M), i;) = for every u,v GY; 

(ii) F is a complex space, and (r(M),«) = for every m G F; 

(iii) T is self -adjoint and (r(tt),%) = for every « G F. 

Give an example of an operator T on a real space V for which (r(w), m> = for 
every m e F but T ^ 0. 

(i) Set V - T{u). Then {T{u), T{u)} = and hence T(u) = 0, for every uGV. Accordingly, 

r = o. 




(ii) By hypothesis, (T(v + w), v + w) = for any v.wGiV. Expanding and setting {T{v),v} = 
and (T{w), w) = 0, 

{T(v), w) + (T(w), V) = Q (1) 

Note w is arbitrary in (1). Substituting iw for w, and using {T(v), iw) = i(T{,v), w) = 
-i(T(v),w) and {T{iw),v) - {iT(w),v) = i{T(w),v), 

-i{T(v), w) + i{T(w), V) = a 

Dividing through by i and adding to (1), we obtain {T(w), v) = Q for any v.wGV. By (1) 
T -Q. 

(iii) By (ii), the result holds for the complex case; hence we need only consider the real case. 
Expanding {T(v + w),v-\-w) = 0, we again obtain (1). Since T is self-adjoint and since it is 
a real space, we have (T(w),v) = (w,T(v)) = {T(v),w). Substituting this into (1), we obtain 
{T(v), w) = Q for any v,w&V. By (i), T - Q. 

For our example, consider the linear operator T on R2 defined by T(x, y) = (y, ~x). Then 
{T{u), u) = for every m G V, but T ¥^ 0. 

ORTHOGONAL AND UNITARY OPERATORS AND MATRICES 

13,23. Prove Theorem 13.9: The following conditions on an operator U are equivalent: 

(i)C/* = C/-i; (ii) {Uiv),Uiw)} = {v,w}, for every v.wGV; (iii) ||C7(v)|| = ||t;||, for 
every v &V. 

Suppose (i) holds. Then, for every v,w &V, 

{Uiv),U(w)) = {v,U*U{w)) = {v,I(w)) - {v,vo) 
Thus (i) implies (ii). Now if (ii) holds, then 



||C7(i;)|| = V<t/(v), V(.v)) = V(^> = \\v\\ 

Hence (ii) implies (iii). It remains to show that (iii) implies (i). 
Suppose (iii) holds. Then for every i; S V, 

(U*U{v), V) = {U{v), V(v)) = {V, v) = {I{v), V) 

Hence ((U*V - I)(v),v) = for every ve.V. But t7*t7 -/ is self -adjoint (Prove!); then by Prob- 
lem 13.22 we have V*U-I-Q and so U*U = I. Thus 17* = t7-i as claimed. 

13.24. Let C7 be a unitary (orthogonal) operator on V, and let W be a subspace invariant 
under U. Show that W^ is also invariant under U. 

Since U is nonsingular, U{W) — W; that is, for any w G W there exists w' G TV such that 
U{w') — w. Now let V G W^. Then for any w &W, 

(U(y), w) = (U(v), V(w')) - {V, w') = 
Thus U{v) belongs to W . Therefore W is invariant under U. 

13.25. Let A be a matrix with rows Ri and columns d. Show that: (i) the i;-entry of A A* 
is {Ri, Rj); (ii) the y-entry of A* A is {d, d). 

If A = (ftjj), then A* = (6y) where 6y = a^. Thus AA* = (cy) where 

n n 

fc— 1 K — 1 

= <(«(!, . . . , ai„), {aji, ..., UjJ) = (Ri, Rj) 
as required. Also, A*A = (dy) where 

n n 

<iii = 2 &ifc«icj = 2 «kj»fci = o-iflii + »2j^ -!-•■■+ a„M^ 
k = l fc = l 

= ((«ij, . . . , a^j), (an, . . . , a„i)) = {Cj, C-) 




13.26. Prove Theorem 13.11: The following conditions for a matrix A are equivalent: 
(i) A is unitary (orthogonal), (ii) The rows of A form an orthonormal set. (iii) The 
columns of A form an orthonormal set. 

Let fij and Cj denote the rows and columns of A, respectively. By the preceding problem, 
AA* = (cy) where Cy = (Ri,«j>. Thus AA* = I if and only if <fli,fij) = Sy. That is, (i) is 
equivalent to (ii). 

Also, by the preceding problem. A* A = (dy) where dy - (Cj, Q. Thus A* A = / if and only 
if (Cj, Cj) = 8y. That is, (i) is equivalent to (iii). 

Remark: Since (ii) and (iii) are equivalent, A is unitary (orthogonal) if and only if the transpose 
of A is unitary (orthogonal). 

13.27. Find an orthogonal matrix A whose first row is u1 = (1/3, 2/3, 2/3).

First find a nonzero vector w2 = (x, y, z) which is orthogonal to u1, i.e. for which

    0 = ⟨u1, w2⟩ = x/3 + 2y/3 + 2z/3, or x + 2y + 2z = 0

One such solution is w2 = (0, 1, −1). Normalize w2 to obtain the second row of A, i.e.
u2 = (0, 1/√2, −1/√2).

Next find a nonzero vector w3 = (x, y, z) which is orthogonal to both u1 and u2, i.e. for which

    0 = ⟨u1, w3⟩ = x/3 + 2y/3 + 2z/3, or x + 2y + 2z = 0
    0 = ⟨u2, w3⟩ = y/√2 − z/√2,       or y − z = 0

Set z = −1 and find the solution w3 = (4, −1, −1). Normalize w3 and obtain the third row of A,
i.e. u3 = (4/√18, −1/√18, −1/√18). Thus

    A = ( 1/3       2/3        2/3     )
        ( 0         1/√2      −1/√2    )
        ( 4/(3√2)  −1/(3√2)   −1/(3√2) )

We emphasize that the above matrix A is not unique.
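
A numerical analogue of this construction (an added sketch, not from the text): place the given first row as the first column of a matrix, pad with standard basis vectors, orthonormalize with a QR factorization, and fix the sign so the first row comes out exactly as requested.

    import numpy as np

    u1 = np.array([1/3, 2/3, 2/3])

    # Columns: u1 followed by two standard basis vectors (linearly independent with u1)
    M = np.column_stack([u1, np.eye(3)[:, :2]])
    Q, R = np.linalg.qr(M)
    Q = Q * np.sign(R[0, 0])    # QR may flip signs; make the first column equal +u1
    A = Q.T                     # rows of A are orthonormal and the first row is u1

    print(np.allclose(A[0], u1))            # True
    print(np.allclose(A @ A.T, np.eye(3)))  # A is orthogonal -> True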

13.28. Prove Theorem 13.12: Let {ei, . . . , e„} be an orthonormal basis of an inner product 
space V. Then the transition matrix from {Ci} into another orthonormal basis is 
unitary (orthogonal). Conversely, if P = (ao) is a unitary (orthogonal) matrix, then 
the following is an orthonormal basis: 

{e'i = auBi + 02162 + • • • + a„ie„ : i = 1, . . . , w} 

Suppose {/j} is another orthonormal basis and suppose 

/i = ^ilBl + 61262 +•••+ fein^n. 1=1, ...,n U) 

By Problem 13.15 and since {/j} is orthonormal, 

Sy = (fufj) = biibfi + 6426^ + • • • + bi„i;~ (2) 

Let B = (6y) be the matrix of coefiicients in (1). (Then B* is the transition matrix from {ej} to 

{/J.) By Problem 13.25, BB* = (cy) where Cy = 6ii67i + ^12^ + • • • + K^n- By (2). "n = ^H 
and therefore BB* = /. Accordingly B, and hence B*, are unitary. 

It remains to prove that {«,'} is orthonormal. By Problem 13.15, 

(ei e'j) = auo^ + a^^j + • • • + a„ia;;j = (Cj, Cj) 

where Cj denotes the ith column of the unitary (orthogonal) matrix P = (ay). By Theorem 13.11, 
the columns of P are orthonormal; hence (e[, ej) = <C{, Cj) = 8y. Thus {e[} is an orthonormal basis. 

13.29. Suppose A is orthogonal. Show that det(A) = 1 or —1. 

Since A is orthogonal, A A* = /. Using \A\ = \A*\, 

1 = 1/| = lAAt| = \A\\At\ = |Ap 
Therefore lAI = 1 or — 1. 




13.30. Show that every 2 by 2 orthogonal matrix A for which det{A) = 1 is of the form 

/cos 6 ~ sin 9\ ^ , , 

. „ „ for some real number 6. 

y sm 6 cos 9 j 

/a b\ 
Suppose A = { ) . Since A is orthogonal, its rows form an orthonormal set; hence 



^c d J 

„2 + 62 = x^ c2 + d2 = 1, ac+hd = 0, ad - be = 1 
The last equation follows from det(A) = 1. We consider separately the cases a. = and a¥'0. 

If a = 0, the first equation gives 6^ = 1 and therefore b = ±1. Then the fourth equation 
gives c = —b = ^:l, and the second equation yields 1 + d^ = i or d = 0. Thus 

^ = (-: I) " c I 

The first alternate has the required form with e — — r/2, and the second alternate has the required 
form with e = ttI2. 

If a 7^ 0, the third equation can be solved to give c =^ —bd/a. Substituting this into the 
second equation, 

62d2/a2 + (£2 =: 1 or bU^ + a2d2 = a2 or (62 + aP')d2 = a^ or a2 = d2 

and therefore a = d or a — —d. If a = —d, then the third equation yields c = b and so the 
fourth equation gives —a^ — c2 = 1 which is impossible. Thus a = d. But then the third equa- 
tion gives b — —c and so 

/a — c^ 
A = 



Since a^ + c^ = 1, there is a real number 9 such that a = cos e, c — sin « and hence A has the 
required form in this case also. 



SYMMETRIC OPERATORS AND CANONICAL FORMS IN EUCLIDEAN SPACES 

13.31. Let r be a symmetric operator. Show that: (i) the characteristic polynomial A{t) of 
r is a product of linear polynomials (over R); (ii) T has a nonzero eigenvector; 
(iii) eigenvectors of T belonging to distinct eigenvalues are orthogonal. 

(i) Let A be a matrix representing T relative to an orthonormal basis of V; then A — A*. Let 
A(t) be the characteristic polynomial of A. Viewing A as a complex self -adjoint operator, A 
has only real eigenvalues by Theorem 13.8. Thus 

A(«) = (i-Xi)(t-X2)---(«-X„) 

where the Xj are all real. In other words, A(t) is a product of linear polynomials over B. 

(ii) By (i), T has at least one (real) eigenvalue. Hence T has a nonzero eigenvector. 

(iii) Suppose T(v) = \v and T(w) = nw where \ ¥= /i. We show that X('U, w) = ii{v, w): 

\{v,w} = {\v,w} = {T{v),w) = {v,T{w)) = {v,nw) = ti{v,w) 

But \¥' n\ hence (v, w) = as claimed. 



13.32. Prove Theorem 18.14: Let T be a symmetric operator on a real inner product space 
V. Then there exists an orthonormal basis of V consisting of eigenvectors of T; 
that is, T can be represented by a diagonal matrix relative to an orthonormal basis. 

The proof is by induction on the dimension of V. If dimV = 1, the theorem trivially holds. 
Now suppose dim F = n > 1. By the preceding problem, there exists a nonzero eigenvector v^ of 
T. Let W be the space spanned by v-^, and let Mj be a unit vector in W, e.g. let Mj = i'i/||vi||. 




Since v^ is an eigenvector of T, the subspace TF of y is invariant under T. By Problem 13.21, 
W^ is invariant under T* = T. Thus the restriction T of T to W^ is a symmetric operator. By 
Theorem 13.2, V =W ®W^. Hence dim TF"*- = m - 1 since dim W^ = 1. By induction, there 
exists an orthonormal basis {u^, . . . , m„} of W^ consisting of eigenvectors of T and hence of T. But 
<Mi, Wj> = for t = 2, . . . , m because Mj G PF-*- . Accordingly {%, %,...,«„} is an orthonormal set 
and consists of eigenvectors of T. Thus the theorem is proved. 



13^3. Let A = ( 2 ^ ) . Find a (real) orthogonal matrix P for which P^AP is diagonal. 



The characteristic polynomial A(t) of A is 



A(t) = |f/-A| = 



t- 1 -2 
-2 t- 1 



= {2 - 2t - 3 = (t - 3)(t + 1) 



and thus the eigenvalues of A are 3 and —1. Substitute t = 3 into the matrix tl — A to obtain the 
corresponding homogeneous system of linear equations 

2x-2y - 0, -2x + 2j/ = 
A nonzero solution is Vi — (1, 1). Normalize v^ to find the unit solution Mi = (ll\2, l/v2). 

Next substitute t — —l into the matrix tl — A to obtain the corresponding homogeneous system 
of linear equations 

-2x - 2j/ = 0, -2a; - 2^/ = 

A nonzero solution is v^ = (1, —1). Normalize V2 to find the unit solution u^ = (1/a/2, —1/^/2). 
Finally let P be the matrix whose columns are Mj and Mj respectively; then 

As expected, the diagonal entries of P*AP are the eigenvalues of A. 



13.34. Let A — \1 2 1 . Find a (real) orthogonal matrix P for which P'^AP is diagonal. 

\l 1 2/ 

First find the characteristic polynomial A{t) of A: 

t - 2 -1 -1 
-1 t - 2 -1 
-1 -1 t - 2 



A(t) = \tI-A\ = 



= (t-l)2(t-4) 



Thus the eigenvalues of A are 1 (with multiplicity two) and 4 (with multiplicity one). Substitute 
t = 1 into the matrix tl — A to obtain the corresponding homogeneous system 

—X — J/ — 2 = 0, —X — y — z = 0, —X — y — z = 

That is, X + y + z = 0. The system has two independent solutions. One such solution is Vi = 
(1, —1, 0). We seek a second solution V2 = (a, 6, c) which is also orthogonal to v^; that is, such that 

a+ b + c = and also a — 6 = 

For example, Vj = (1, 1, —2). Next we normalize f j and V2 to obtain the unit orthogonal solutions 

Ml = (l/\/2, -I/V2, 0), Ma = (l/\/6, l/\/6, -2/V^) 

Now substitute t = 4 into the matrix tl — A to find the corresponding homogeneous system 

2x — y — z = 0, -X + 2y - z = 0, -x - y + 2z = 

Find a nonzero solution such as t^s = (1, 1, 1), and normalize v^ to obtain the unit solution 
M3 = (l/v3> l/v3, l/vS). Finally, if P is the matrix whose columns are the Wj respectively. 






i/\/2 i/Ve i/VsX 

P = I -I/V2 l/Ve l/Vs and PtAP 

-2/\/6 I/V3/ 




13.35. Find an orthogonal change of coordinates which diagonalizes the real quadratic form 

q(x, y) = 2x^ + 2xy + 2y^. 

First find the symmetric matrix A representing q and then its characteristic polynomial A(t): 
'2 1- 



A = 



1 2 



and A{t) = 1*7- A| = 



t- 2 -1 
-1 t - 2 



{t-l){t-3) 



The eigenvalues of A are 1 and 3; hence the diagonal form of q is 

q(x', y') = x'^ + Zy'^ 

We find the corresponding transformation of coordinates by obtaining a corresponding orthonormal 
set of eigenvectors of A. 

Set f = 1 into the matrix tl — A to obtain the corresponding homogeneous system 

—X — y = 0, —X — y — 

A nonzero solution is v^ = (1,-1). Now set i = 3 into the matrix tl — A to find the corresponding 
homogeneous system 

X — y — 0, —X + y = 

A nonzero solution is V2 = (1, 1). As expected by Problem 13.31, v^ and V2 are orthogonal. Normalize 
Vi and V2 to obtain the orthonormal basis 

{ui = (l/\/2, -l/\/2), M2 = (l/\/2, I/V2)} 

The transition matrix P and the required transformation of coordinates follow: 



P = 



l/^/2 l/\/2 
-l/\/2 l/\/2 



and 



= P 



(x' + y')/V2 
(-x' + y')/^/2 



Note that the columns of P are Mj and 1*2- We can also express x' and y' in terms of x and j/ by 
using P^i = P'; that is, 

x' = {x-y)/V2, y' = (« + j/)/\/2 



13.36. Prove Theorem 13.15: Let T be an orthogonal operator on a real inner product space 
V. Then there is an orthonormal basis with respect to which T has the following 
form: 



1 I 

-_j 

I -1 



-1 



-1 



^ 1 

1 cos Oi — sm di I 



I sin di cos 01 



I cos 9r — sm 9r 
I sin 9r cos 6r 




Let S = r + r-i = T + T*. Then S* = (T + T*)* = T* + T = S. Thus S Is a symmetric 
operator on V. By Theorem 13.14, there exists an orthonormal basis of V consisting of eigenvectors 
of S. If Xi, . . . , Xjn denote the distinct eigenvalues of S, then V can be decomposed into the direct 
sum y = Vi Va © • • • © Vm where the Vj consists of the eigenvectors of S belonging to Xj. We 
claim that each Vj is invariant under T. For suppose v e Vj; then S{v) — \v and 

S(T(v)) = (T+T-^)T(v) = T(T+T-^){v) = TS{v) = TiXfV) = \iT(v) 

That is, T{v) & Fj. Hence Vi is invariant under T. Since the V; are orthogonal to each other, we 
can restrict our investigation to the way that T acts on each individual 'V^. 

On a given V;, (T + T-^)v = S(v) = \v. Multiplying by T, 

(T2-\T + I){v) = 

We consider the cases Xj = ±2 and X; ¥= ±2 separately. If Xj = ±2, then (T ± I)Hv) - which 
leads to (T ± I){v) = or T(v) = ±v. Thus T restricted to this Fj is either I or -/. 

If Xj ¥= ±2, then T has no eigenvectors in Vj since by Theorem 13.8 the only eigenvalues of T 
are 1 or —1. Accordingly, for v ¥= the vectors v and T{v) are linearly independent. Let W be 
the subspace spanned by v and T(v). Then W is invariant under T, since 

T(T(v)) = T^v) = \^T(v) - v 

By Theorem 13.2, Vj = W © W-^ . Furthermore, by Problem 13.24 w'' is also invariant under T. 
Thus we can decompose V^ into the direct sum of two dimensional subspaces Wj where the Wj are 
orthogonal to each other and each Wj is invariant under T. Thus we can now restrict our investiga- 
tion to the way T acts on each individual Wj. 

Since T^ — XjT + / = 0, the characteristic polynomial A(t) of T acting on Wj is A(t) = 
t^ — \t + 1. Thus the determinant of T is 1, the constant term in A(t). By Problem 13.30, the 
matrix A representing T acting on Wj relative to any orthonormal basis of Wj must be of the form 

'cos e — sin e^ 
^ sin e cos e y 

The union of the basis of the Wj gives an orthonormal basis of Vj, and the union of the basis of the 
Vj gives an orthonormal basis of V in which the matrix representing T is of the desired form. 



NORMAL OPERATORS AND CANONICAL FORMS IN UNITARY SPACES 

13.37. Determine which matrix is normal: (i) A = / ^ *■ ) , (ii) B = I , 2 + • 

<» - = (;:)(-;:) = (-:;) -- g OG o = (-' ^ 

Since AA* ¥= A*A, the matrix A is not normal. 

^"' 1^1 2 + iJ\-i 2-iJ [2-2i 6 

\-i 2-iJ\l 2 + iJ \2-2i 6 
Since BB* = B*B, the matrix B is normal. 

13.38. Let T be a normal operator. Prove: 

(i) Tiv) = if and only if T*{v) = 0. 

(ii) T — \I is normal. 

(iii) If T{v) = \v, then T*{v) = Xv; hence any eigenvector of T is also an eigen- 
vector of T*. 

(iv) If T{v) = Xiv and T{w) = X2W where A,i ^ A2, then {v,w) = 0; that is, eigen- 
vectors of T belonging to distinct eigenvalues are orthonormal. 




(i) We show that {T(v), T{v)) = {T*(v), T*iv)): 

(T(v), T(v)) = (V, T*T{v)) = {V, TTHv)) = {T-{v), T'-(v)) 
Hence by [/g], T(v) = if and only if T*(v) = 0. 

(ii) We show that T — \I commutes with its adjoint: 

{T - \i)(,T - \i)* = (r-x7)(r*-x/) = rr* - XT* - xr + xx/ 

_ T*T -XT - XT* + XXI = {T* -Xl){T - XI) 

= (T - XI)*{T - XI) 
Thus T ~\I is normal. 

(iii) If T(v) = XV, then (T - Xl){v) = 0. Now T - Xl is normal by (ii); therefore, by (i), 
(r-X/)*(i>) = 0. That is, [T* -Xl)(v) = 0; hence T*(v) -Xv. 

(iv) We show that Xi{v, w) = X2<v, w): 

Xi{v,w} = (Xiv.w) = {T(v),w) = {v,T*(w)) - {v.X^w) = X<i_{v,w) 
But Xj ¥= X2; hence {v, w) = 0. 



13.39. Prove Theorem 13.16: Let T be a normal operator on a complex finite dimensional 
inner product space V. Then there exists an orthonormal basis of V consisting of 
eigenvectors of T; that is, T can be represented by a diagonal matrix relative to an 
orthonormal basis. 

The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially 
holds. Now suppose dim V — n> \. Since V is a complex vector space, T has at least one eigen- 
value and hence a nonzero eigenvector v. Let W be the subspace of V spanned by v and let u^ be a 
unit vector in W. 

Since v is an eigenvector of T, the subspace W is invariant under T. However, v is also an 
eigenvector of T* by the preceding problem; hence W is also invariant under T*. By Problem 13.21, 
W is invariant under T** = T. The remainder of the proof is identical with the latter part of 
the proof of Theorem 13.14 (Problem 13.32). 



13.40. Prove Theorem 13.17: Let T be an arbitrary operator on a complex finite dimensional 
inner product space V. Then T can be represented by a triangular matrix relative 
to an orthonormal basis {Ui, U2, . . ., Wn} ; that is, for i = l, . . .,n, 

T{ui) — OiiMi + ai2U2 + • • • + aiiUi 

The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially 
holds. Now suppose dim V = n > 1. Since V is a complex vector space, T has at least one eigen- 
value and hence at least one nonzero eigenvector v. Let W be the subspace of V spanned by v and 
let Ml be a unit vector in W. Then itj is an eigenvector of T and, say, T{ui) = a^Ui. 

By Theorem 13.2, V =W ®W^. Let E denote the orthogonal projection of V into W''' . 
Clearly W is invariant under the operator ET. By induction, there exists an orthonormal basis 
{^2, . . . , M„} of W such that, for i = 2, . . .,n, 

ET{Ui) = OJ2M2 + ttiaMg + • • ■ + djiitj 

(Note that {ui,U2, ...,u„} is an orthonormal basis of V.) But E is the orthogonal projection of V 
onto W ; hence we must have 

T(Ui) = ajiMi + ai2M2 + • ■ • + fliiM; 

for i = 2,...,n. This with T(ui) = a^Ui gives us the desired result. 




MISCELLANEOUS PROBLEMS 

13.41. Prove Theorem 13.13A: The following conditions on an operator P are equivalent:

       (i)   P = T² for some self-adjoint operator T.

       (ii)  P = S*S for some operator S.

       (iii) P is self-adjoint and ⟨P(u), u⟩ ≥ 0 for every u ∈ V.

       Suppose (i) holds, that is, P = T² where T = T*. Then P = TT = T*T and so (i) implies
       (ii). Now suppose (ii) holds. Then P* = (S*S)* = S*S** = S*S = P and so P is self-adjoint.

       Furthermore,
              ⟨P(u), u⟩ = ⟨S*S(u), u⟩ = ⟨S(u), S(u)⟩ ≥ 0

       Thus (ii) implies (iii), and so it remains to prove that (iii) implies (i).

       Now suppose (iii) holds. Since P is self-adjoint, there exists an orthonormal basis {u₁, ..., uₙ}
       of V consisting of eigenvectors of P; say, P(uᵢ) = λᵢuᵢ. By Theorem 13.8, the λᵢ are real. Using
       (iii), we show that the λᵢ are nonnegative. We have, for each i,

              0 ≤ ⟨P(uᵢ), uᵢ⟩ = ⟨λᵢuᵢ, uᵢ⟩ = λᵢ⟨uᵢ, uᵢ⟩

       Thus ⟨uᵢ, uᵢ⟩ > 0 forces λᵢ ≥ 0, as claimed. Accordingly, √λᵢ is a real number. Let T be the
       linear operator defined by

              T(uᵢ) = √λᵢ uᵢ,   for i = 1, ..., n

       Since T is represented by a real diagonal matrix relative to the orthonormal basis {uᵢ}, T is self-
       adjoint. Moreover, for each i,

              T²(uᵢ) = T(√λᵢ uᵢ) = √λᵢ T(uᵢ) = √λᵢ √λᵢ uᵢ = λᵢuᵢ = P(uᵢ)

       Since T² and P agree on a basis of V, P = T². Thus the theorem is proved.

       Remark: The above operator T is the unique positive operator such that P = T² (Problem
       13.93); it is called the positive square root of P.
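
The construction in the proof is directly computable: take a spectral decomposition of P and replace each eigenvalue by its square root. A small numerical sketch (not part of the original text):

    import numpy as np

    P = np.array([[2.0, 1.0], [1.0, 2.0]])        # a positive symmetric matrix

    evals, U = np.linalg.eigh(P)                  # P = U diag(evals) U^t, U orthogonal
    T = U @ np.diag(np.sqrt(evals)) @ U.T         # T(u_i) = sqrt(lambda_i) u_i on the eigenbasis

    print(np.allclose(T @ T, P))                  # True: T^2 = P
    print(np.all(np.linalg.eigvalsh(T) >= 0))     # True: T is itself positive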

13.42. Show that any operator T is the sum of a self-adjoint operator and a skew-adjoint
       operator.

       Set S = ½(T + T*) and U = ½(T - T*). Then T = S + U where

              S* = (½(T + T*))* = ½(T* + T**) = ½(T* + T) = S

       and    U* = (½(T - T*))* = ½(T* - T) = -½(T - T*) = -U

       i.e. S is self-adjoint and U is skew-adjoint.
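
A quick numerical confirmation of this decomposition (a sketch, not part of the original text):

    import numpy as np

    T = np.array([[1 + 2j, 3], [0, 4 - 1j]])      # an arbitrary operator (matrix)
    S = (T + T.conj().T) / 2                      # self-adjoint part
    U = (T - T.conj().T) / 2                      # skew-adjoint part

    print(np.allclose(S, S.conj().T))             # True: S* = S
    print(np.allclose(U, -U.conj().T))            # True: U* = -U
    print(np.allclose(S + U, T))                  # True: T = S + U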

13.43. Prove: Let T be an arbitrary linear operator on a finite dimensional inner product 
space V. Then T is a product of a unitary (orthogonal) operator U and a unique 
positive operator P, that is, T = UP. Furthermore, if T is invertible, then U is also 
uniquely determined. 

       By Theorem 13.13, T*T is a positive operator and hence there exists a (unique) positive operator
       P such that P² = T*T (Problem 13.93). Observe that

              ||P(v)||² = ⟨P(v), P(v)⟩ = ⟨P²(v), v⟩ = ⟨T*T(v), v⟩ = ⟨T(v), T(v)⟩ = ||T(v)||²     (1)

       We now consider separately the cases when T is invertible and non-invertible.

       If T is invertible, then we set Ũ = PT⁻¹. We show that Ũ is unitary:

              Ũ* = (PT⁻¹)* = (T⁻¹)*P* = (T*)⁻¹P   and   Ũ*Ũ = (T*)⁻¹PPT⁻¹ = (T*)⁻¹T*TT⁻¹ = I

       Thus Ũ is unitary. We next set U = Ũ⁻¹. Then U is also unitary and T = UP as required.

       To prove uniqueness, we assume T = U₀P₀ where U₀ is unitary and P₀ is positive. Then

              T*T = P₀*U₀*U₀P₀ = P₀IP₀ = P₀²

       But the positive square root of T*T is unique (Problem 13.93); hence P₀ = P. (Note that the
       invertibility of T is not used to prove the uniqueness of P.) Now if T is invertible, then P is also
       invertible by (1). Multiplying U₀P = UP on the right by P⁻¹ yields U₀ = U. Thus U is also unique
       when T is invertible.




       Now suppose T is not invertible. Let W be the image of P, i.e. W = Im P. We define
       U₁ : W → V by

              U₁(w) = T(v)   where P(v) = w     (2)

       We must show that U₁ is well defined, that is, that P(v) = P(v') implies T(v) = T(v'). This follows
       from the fact that P(v - v') = 0 is equivalent to ||P(v - v')|| = 0, which forces ||T(v - v')|| = 0
       by (1). Thus U₁ is well defined. We next define U₂ : W⊥ → V. Note by (1) that P and T have the
       same kernels. Hence the images of P and T have the same dimension, i.e. dim (Im P) = dim W =
       dim (Im T). Consequently, W⊥ and (Im T)⊥ also have the same dimension. We let U₂ be any
       isomorphism (of inner product spaces) between W⊥ and (Im T)⊥.

       We next set U = U₁ ⊕ U₂. (Here U is defined as follows: if v ∈ V and v = w + w' where
       w ∈ W, w' ∈ W⊥, then U(v) = U₁(w) + U₂(w').) Now U is linear (Problem 13.121) and, if v ∈ V
       and P(v) = w, then by (2)

              T(v) = U₁(w) = U(w) = UP(v)

       Thus T = UP as required.

       It remains to show that U is unitary. Now every vector x ∈ V can be written in the form
       x = P(v) + w' where w' ∈ W⊥. Then U(x) = UP(v) + U₂(w') = T(v) + U₂(w') where
       ⟨T(v), U₂(w')⟩ = 0 by definition of U₂. Also, ⟨T(v), T(v)⟩ = ⟨P(v), P(v)⟩ by (1). Thus

              ⟨U(x), U(x)⟩ = ⟨T(v) + U₂(w'), T(v) + U₂(w')⟩
                           = ⟨T(v), T(v)⟩ + ⟨U₂(w'), U₂(w')⟩
                           = ⟨P(v), P(v)⟩ + ⟨w', w'⟩ = ⟨P(v) + w', P(v) + w'⟩
                           = ⟨x, x⟩

       (We also used the fact that ⟨P(v), w'⟩ = 0.) Thus U is unitary and the theorem is proved.
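
This factorization T = UP is the polar decomposition. As a sketch (not part of the original text), scipy computes it directly; side='right' places the positive factor on the right.

    import numpy as np
    from scipy.linalg import polar

    T = np.array([[1.0, 2.0], [0.0, 3.0]])
    U, P = polar(T, side='right')                   # T = UP, U orthogonal, P positive

    print(np.allclose(U @ P, T))                    # True
    print(np.allclose(U.T @ U, np.eye(2)))          # True: U is orthogonal
    print(np.all(np.linalg.eigvalsh(P) >= -1e-12))  # True: P is positive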



13.44. Let (a₁, a₂, ...) and (b₁, b₂, ...) be any pair of points in the ℓ²-space of Example 13.5.
       Show that the sum

              Σᵢ₌₁^∞ aᵢbᵢ = a₁b₁ + a₂b₂ + ···

       converges absolutely.

       By Problem 1.16 (Cauchy-Schwarz inequality),

              |a₁b₁| + ··· + |aₙbₙ| ≤ √(a₁² + ··· + aₙ²) √(b₁² + ··· + bₙ²) ≤ √(Σᵢ aᵢ²) √(Σᵢ bᵢ²)

       which holds for every n. Thus the (monotonic) sequence of sums Sₙ = |a₁b₁| + ··· + |aₙbₙ| is
       bounded, and therefore converges. Hence the infinite sum converges absolutely.
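
A small numerical illustration (not from the original text): take aᵢ = 1/i and bᵢ = 1/i², both points of ℓ²-space; the partial sums of |aᵢbᵢ| stay below the Cauchy-Schwarz bound.

    import math

    a = [1 / i for i in range(1, 1001)]
    b = [1 / (i * i) for i in range(1, 1001)]

    partial = sum(abs(x * y) for x, y in zip(a, b))
    bound = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))

    print(partial <= bound)     # True for every n, so the partial sums are bounded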



13.45. Let V be the vector space of polynomials over R with inner product defined by
       ⟨f, g⟩ = ∫₀¹ f(t) g(t) dt. Give an example of a linear functional φ on V for which
       Theorem 13.5 does not hold, i.e. there does not exist a polynomial h(t) for which
       φ(f) = ⟨f, h⟩ for every f ∈ V.

       Let φ : V → R be defined by φ(f) = f(0), that is, φ evaluates f(t) at 0 and hence maps f(t) into
       its constant term. Suppose a polynomial h(t) exists for which

              φ(f) = f(0) = ∫₀¹ f(t) h(t) dt     (1)

       for every polynomial f(t). Observe that φ maps the polynomial t f(t) into 0; hence by (1),

              ∫₀¹ t f(t) h(t) dt = 0     (2)

       for every polynomial f(t). In particular, (2) must hold for f(t) = t h(t), that is,

              ∫₀¹ t² h²(t) dt = 0

       This integral forces h(t) to be the zero polynomial; hence φ(f) = ⟨f, h⟩ = ⟨f, 0⟩ = 0 for every poly-
       nomial f(t). This contradicts the fact that φ is not the zero functional; hence the polynomial h(t)
       does not exist.




Supplementary Problems 

INNER PRODUCTS 

13.46. Verify that

              ⟨a₁u₁ + a₂u₂, b₁v₁ + b₂v₂⟩ = a₁b̄₁⟨u₁, v₁⟩ + a₁b̄₂⟨u₁, v₂⟩ + a₂b̄₁⟨u₂, v₁⟩ + a₂b̄₂⟨u₂, v₂⟩

       More generally, prove that

              ⟨ Σᵢ₌₁ᵐ aᵢuᵢ , Σⱼ₌₁ⁿ bⱼvⱼ ⟩ = Σᵢ,ⱼ aᵢb̄ⱼ⟨uᵢ, vⱼ⟩



13.47. Let u = (x₁, x₂) and v = (y₁, y₂) belong to R².

       (i)  Verify that the following is an inner product on R²:

                   f(u, v) = x₁y₁ - 2x₁y₂ - 2x₂y₁ + 5x₂y₂

       (ii) For what values of k is the following an inner product on R²?

                   f(u, v) = x₁y₁ - 3x₁y₂ - 3x₂y₁ + kx₂y₂

       (iii) For what values of a, b, c, d ∈ R is the following an inner product on R²?

                   f(u, v) = ax₁y₁ + bx₁y₂ + cx₂y₁ + dx₂y₂

13.48. Find the norm of v = (1, 2) € R2 with respect to (i) the usual inner product, (ii) the inner 
product in Problem 13.47(i). 

13.49. Let u = (zi, Z2) and v = (w^, W2) belong to C^. 

(i) Verify that the following is an inner product on C^: 

f{u, V) = ZiWi + (1 + i)ZiW2 + (1 — i)22'"'l + SZ2W2 

(ii) For what values of a, b, c, d e C is the following an inner product on C^? 

f(u, v) = aziWi + bziW2 + CZ2W1 + dz2W2 

13.50. Find the norm of v = (l — 2i,2 + Si) G C^ with respect to (i) the usual inner product, (ii) the 
inner product in Problem 13.49(i). 

13.51. Show that the distance function d(u, v) = ||v - u||, where u, v ∈ V, satisfies the following axioms
       of a metric space:

       [D₁] d(u, v) ≥ 0; and d(u, v) = 0 if and only if u = v.

       [D₂] d(u, v) = d(v, u).

       [D₃] d(u, v) ≤ d(u, w) + d(w, v).

13.52. Verify the Parallelogram Law: Hm + ijII + ||M-'y|| = 2||m|| + 2||d||. 

13.53. Verify the following polar forms for ⟨u, v⟩:

       (i)  ⟨u, v⟩ = ¼||u + v||² - ¼||u - v||²   (real case);

       (ii) ⟨u, v⟩ = ¼||u + v||² - ¼||u - v||² + (i/4)||u + iv||² - (i/4)||u - iv||²   (complex case).

13.54. Let V be the vector space of m X ti matrices over R. Show that {A,B) = tr(B'A) defines an inner 
product in V. 

13.55. Let V be the vector space of polynomials over R. Show that {f,g)= I f{t) g(t) dt defines an 
inner product in V. 

13.56. Find the norm of each of the following vectors: 

(i) u = (|,-Jt,^,^)GR4, 

(ii) V = (1 - 2t, 3 + i, 2 - 5i) G C^, 

(iii) /(() = t2 _ 2t + 3 in the space of Problem 13.55, 

(iv) A — [ ) in the space of Problem 13.54. 

V3 -4/ 




13.57. Show that: (i) the sum of two inner products is an inner product; (ii) a positive multiple of an 
inner product is an inner product. 

13.58. Let a, 6, c e R be such that at^ + bt + c- for every t e R. Show that 62 _ 4^0 ^ 0. Use this 
result to prove the Cauchy-Schwarz inequality for real inner product spaces by expanding 

||tM + i;||2 ^ 0. 

13.59. Suppose |⟨u, v⟩| = ||u|| ||v||. (That is, the Cauchy-Schwarz inequality reduces to an equality.) Show
       that u and v are linearly dependent.

13.60. Find the cosine of the angle e between u and v if: 
(i) u = (1, -3, 2), V = (2, 1, 5) in RS; 

(ii) u — 2t — l, V = t^ in the space of Problem 13.55; 

/2 1\ /O -1\ 

(ill) M=(_l,v = ( ) in the space of Problem 13.54. 

ORTHOGONALITY 

13.61. Find a basis of the subspace W of R* orthogonal to u^ = (1, —2,3,4) and % = (3. —5, 7, 8). 

13.62. Find an orthonormal basis for the subspace W of C^ spanned by Wj = (1, i, 1) and % = (1 + i, 0, 2). 

13.63. Let V be the vector space of polynomials over R of degree — 2 with inner product </, g) = 

Cf{t)g{t)dt. 

(i) Find a basis of the subspace W orthogonal to h{t) = 2t + 1. 

(ii) Apply the Gram-Schmidt orthogonalization process to the basis {1, t, i^} to obtain an ortho- 
normal basis {%(*), U2(t), u^(t)} of V. 

13.64. Let y be the vector space of 2 X 2 matrices over R with inner product defined by {A,B) — tr(B*A). 
(i) Show that the following is an orthonormal basis of V: 

'1 0\ /O 1\ /O 0\ /O ON 

,0 07' Vo o;- (,0 \)' Vo \, 

(ii) Find a basis for the orthogonal complement of (a) the diagonal matrices, (6) the symmetric 
matrices. 

13.65. Let If be a subset (not necessarily subspace) of V. Prove: (i) W = •E'(PF); (ii) if V has finite 
dimension, then W — I'(W). (Here UyV) is the space spanned by W.) 

13.66. Let W be the subspace spanned by a nonzero vector w in V, and let E be the orthogonal projection
       of V onto W. Prove that

              E(v) = (⟨v, w⟩ / ||w||²) w

       We call E(v) the projection of v along w.

13.67. Find the projection of v along w if: 
(i) V = (1, -1, 2), w = (0, 1, 1) in R3; 

(ii) V -(l-i,2 + 3i),w = (2~-i,S) in C2; 

(iii) V = 2t — l, w = t^ in the space of Problem 13.55; 

/I 2\ /O -1\ 

(iv) v = I q)''*'~(i 9)^" *^® space of Problem 13.54. 

13.68. Suppose {mj, . . .,mJ is a basis of a subspace W of V where dim V = n. Let {vi, . . .,i;„_,} be an 
independent set of n — r vectors such that (Uj, Vj) = for each i and each j. Show that 
{■^1, . . .,a>„_r} is a basis of the orthogonal complement W . 




13.69. Suppose {mi, . ..,u^) is an orthonormal basis for a subspace W of V. Let E :V -^V be the linear 
mapping defined by „, , , , , , , 

E(V) = {V, Mi>Mi + (V, M2>M2 + ' ' • + {V, ll^U, 

Show that E Is the orthogonal projection of V onto W. 

r 

13.70. Let {tti mJ be an orthonormal subset of V. Show that, for any v€.V, 2 K^'.Wi)!^ - ll^'ll^- 

(This Is known as Bessel's Inequality.) " 

13.71. Let y be a real inner product space. Show that: 

(I) ||m|| = ll'i'll If and only If {u + v,u — v) = Q; 

(II) ||ti + 1^112 = ||m||2 + 11^112 If and only If (u,v) = 0. 

Show by counterexamples that the above statements are not true for, say, C^. 

13.72. Let U and W be subspaces of a finite dimensional inner product space V. Show that: {i) (U+W) = 

U-^ nW-^; (11) (UnW)-^ = U-^ + W^. 

ADJOINT OPERATOR 

13.73. Let r : R3 ^ R3 be defined by Tix, y, z) = (x + 2y, Zx - Az, y). Find T* (x, y, «). 

13.74. Let r : C3 ^ 03 be defined by 

T(,x, y, z) = (ix + (2 + Zi)y, 3a! + (3 - i)z, (2 - hi)y + iz) 
Find T*{x,y,z). 

13.75. For each of the following linear functionals φ on V find a vector u ∈ V such that φ(v) = ⟨v, u⟩ for
       every v ∈ V:

       (i)   φ : R³ → R defined by φ(x, y, z) = x + 2y - 3z.

       (ii)  φ : C³ → C defined by φ(x, y, z) = ix + (2 + 3i)y + (1 - 2i)z.

       (iii) φ : V → R defined by φ(f) = f(1) where V is the vector space of Problem 13.63.



13.76. Suppose V has finite dimension. Prove that the image of T* is the orthogonal complement of the 
kernel of T, i.e. Im T* = (Ker T)-^ . Hence rank(r) = rank(r*). 

13.77. Show that T*T = implies 7 = 0. 

13.78. Let V be the vector space of polynomials over R with inner product defined by (/, ff> = I f(t) g(t) dt. 

Let D be the derivative operator on V, i.e. D{f) = df/dt. Show that there is no operator D* on V 
such that (D(f),g) = {f,D*(g)) for every f,g &V. That is, D has no adjoint. 



UNITARY AND ORTHOGONAL OPERATORS AND MATRICES 

13.79. Find an orthogonal matrix whose first row is: (1) (l/VS, 2/v'5); (il) a multiple of (1,1,1). 

13.80. Find a symmetric orthogonal matrix whose first row is (1/3,2/3,2/3). (Compare with Problem 
13.27.) 

13.81. Find a unitary matrix whose first row is: (1) a multiple of (1, 1 — t); (ii) (\, \i, ^ — Ji) 

13.82. Prove: The product and inverses of orthogonal matrices are orthogonal. (Thus the orthogonal 
matrices form a group under multiplication called the orthogonal group.) 

13.83. Prove: The product and Inverses of unitary matrices are unitary. (Thus the unitary matrices 
form a group under multiplication called the unitary group.) 

13.84. Show that if an orthogonal (unitary) matrix is triangular, then it is diagonal. 




13.85. Recall that the complex matrices A and B are unitarily equivalent if there exists a unitary matrix 
P such that B = P*AP. Show that this relation is an equivalence relation. 

13.86. Recall that the real matrices A and B are orthogonally equivalent if there exists an orthogonal 
matrix P such that B — P*AP. Show that this relation is an equivalence relation. 

13.87. Let W be a subspace of V. For any v&V let v = w + w' where wGWyW'e W^ . (Such a sum 
is unique because V=W®W^.) Let T-.V^V be defined by T(v) =w-w'. Show that T is 
a self-adjoint unitary operator on V. 

13.88. Let V be an inner product space, and suppose U : V -* V (not necessarily linear) is surjective (onto) 
and preserves inner products, i.e. {U(v), U(w)) = {u,w) for every v,w&V. Prove that U is 
linear and hence unitary. 

POSITIVE AND POSITIVE DEFINITE OPERATORS 

13.89. Show that the sum of two positive (positive definite) operators is positive (positive definite). 

13.90. Let r be a linear operator on V and let f-.V^-V-^K be defined by f{u, v) = (T{u), v). Show 
that / is itself an inner product on V if and only if T is positive definite. 

13.91. Suppose E is an orthogonal projection onto some subspace W of V. Prove that kl + E is positive 
(positive definite) if A; a= (A; > 0). 

13.92. Prove Theorem 13.13B, page 288, on positive definite operators. (The corresponding Theorem 
13.13A for positive operators is proved in Problem 13.41.) 

13.93. Consider the operator T defined by T(Ui) = VTjMi, i = 1, . . .,«, in the proof of Theorem 18^3A 
(Problem 13.41). Show that T is positive and that it is the only positive operator for which T^ - P. 

13.94. Suppose P is both positive and unitary. Prove that P = /. 

13.95. An « X M (real or complex) matrix A = (ajj) is said to be positive if A viewed as a linear operator 
on K» is positive. (An analogous definition defines a positive definite matrix.) Prove A is positive 
(positive definite) if and only if ay = a^ and 

n 

2 ayiCjSc] - (>0) 

i,3 = 1 

for every (a;i, ...,«;„) in K^. 

13.96. Determine which of the following matrices are positive (positive definite): 

(i) (ii) (iii) (iv) (v) (vi) 

13.97. Prove that a 2 X 2 complex matrix A = [ ) is positive if and only if (i) A= A*, and 



(ii) a, d and ad — be are nonnegative real numbers. 

13.98. Prove that a diagonal matrix A is positive (positive definite) if and only if every diagonal entry is 
a nonnegative (positive) real number. 

SELF-ADJOINT AND SYMMETRIC OPERATORS 

13.99. For any operator T, show that T + T* is self -adjoint and T - T* is skew-adjoint. 

13.100. Suppose T is self-adjoint. Show that THv) = implies T(v) = 0. Use this to prove that 
THv) = also implies T(v) = lor w > 0. 




13.101. Let F be a complex inner product space. Suppose (T(v), v) is real for every v G V. Show that T 
is self-adjoint. 

13.102. Suppose S and T are self-adjoint. Show that ST is self-adjoint if and only if S and T commute, 
i.e. ST = TS. 

13.103. For each of the following symmetric matrices A, And an orthogonal matrix P for which P'AP is 
diagonal: 

13.104. Find an orthogonal transformation of coordinates which diagonalizes each quadratic form: 

(i) q{x, y) = 2x^ — 6xy + lOy^, (ii) q{x, y) = x'^ -\- Sxy — 5y^ 

13.105. Find an orthogonal transformation of coordinates which diagonalizes the quadratic form 
q(x, y, z) = 2xy + 2xz + 2yz. 

NORMAL OPERATORS AND MATRICES 

13.106. Verify that A = [2  i; i  2] is normal. Find a unitary matrix P such that P*AP is diagonal, and
        find P*AP.

13.107. Show that a triangular matrix is normal if and only if it is diagonal. 

13.108. Prove that if T is normal on V, then ||r('y)|| = ||r*('u)|| for every vGY. Prove that the converse 
holds in complex inner product spaces. 

13.109. Show that self-adjoint, skew-adjoint and unitary (orthogonal) operators are normal. 

13.110. Suppose T is normal. Prove that: 

(i) T is self-adjoint if and only if its eigenvalues are real. 

(ii) T is unitary if and only if its eigenvalues have absolute value 1. 

(iii) T is positive if and only if its eigenvalues are nonnegative real numbers. 

13.111. Show that if T is normal, then T and T* have the same kernel and the same image. 

13.112. Suppose S and T are normal and commute. Show that S+T and ST are also normal. 

13.113. Suppose T is normal and commutes with S. Show that T also commutes with S*. 

13.114. Prove: Let S and T be normal operators on a complex finite dimensional vector space V. Then 
there exists an orthonormal basis of V consisting of eigenvectors of both S and T. (That is, S and 
T can be simultaneously diagonalized.) 

ISOMORPHISM PROBLEMS 

13.115. Let {ei, . . . , e„} be an orthonormal basis of an inner product space V over K. Show that the map 
V ^[v]g is an (inner product space) isomorphism between V and X". (Here [v]^ denotes the co- 
ordinate vector of v in the basis {cj}.) 

13.116. Show that inner product spaces V and W over K are isomorphic if and only if V and W have the 
same dimension. 

13.117. Suppose {ej, ...,ej and {ei . . . , e^} are orthonormal bases of V and W respectively. Let T : V ^ W 
be the linear map defined by T{ei) = e(, for each i. Show that T is an isomorphism. 




13.118. Let V be an inner product space. Recall (pag:e 283) that each uG V determines a linear functional 
u in the dual space V* by the definition u (v) = {v, u) for every v E.V. Show that the map 
M H- M is linear and nonsingular, and hence an isomorphism from V onto V*. 

13.119. Consider the inner product space V of Problem 13.54. Show that V is isomorphic to R"*" under the 
mapping 

/«U «12 •• • «ln \ 
\^m\ ^m2 • • • ^mnl 

where Ri = (ctii, ajj, . . .,«i„), the fth row of A. 

MISCELLANEOUS PROBLEMS 

13.120. Show that there exists an orthonormal basis {mj, . . .,m„} of V consisting of eigenvectors of T if and 
only if there exist orthogonal projections Ei,...,E^ and scalars \i,...,\ such that: (i) T = 
\iEi + • • ■ + X^^; (ii) Ei+ ■■■ + Er = I; (iii) EiEj = for i ¥■ j. 

13.121. Suppose V = U®W and suppose TiiU^V and T^:W-^V are linear. Show that T = 
Ti © T2 is also linear. (Here T is defined as follows: if t) e V and v = u + w where uG:U,wG.W, 
then T(v) = T.^(u) + TgC^)-) 

13.122. Suppose U is an orthogonal operator on R3 with positive determinant. Show that U is either a 
rotation or a reflection through a plane. 



Answers to Supplementary Problems 

13.47. (ii) /c > 9; (iii) a > 0, d > 0, ad - 6c > 

13.48. (i) VE, (ii) VH 
13.50. (i) 3V2, (ii) 5V2 

13.56. (i) IJMll =V65/12, (ii) ||i;|| = 2\/ll, (iii) \\f{t)\\ = ^/83/15, (iv) ||A|| = V30 

13.60. (i) cos e = 9/\/420, (ii) cos e = \/l5/6, (iii) cos e - 2/y/2\a 

13.61. {vi = (1,2,1,0), v^ = (4,4,0,1)} 

13.62. {vi = (1, i, 1)/VS, V2 = (2i, 1 - Si, 3 - i)/V24} 

13.63. (i) {/i it) = 7t2 - 5«, fzit) = 12*2 _ 5} 

(ii) {Mj(t) = 1, M2(t) = (2t - l)/\/3, M3(t) = (6*2 - 6f + 1)/a/5 } 

, r- / 7/V6 

13.67. (i) (0,l/\/2, 1/^/2), (ii) (26 + 7t, 27 + 24t)/V14, (iii) V5 tVS, (iv) i _ r- ...^ 

13.73. T*(x, y, z) = (x + 3y, 2x + z, -4y)

13.74. T*(x, y, z) = (-ix + 3y, (2 - 3i)x + (2 + 5i)z, (3 + i)y - iz)




13.75. Let u = ^(ei)ei + • • ■ + 0(e„)e„ where {ej} is an orthonormal basis. 

(i) u = (1, 2, -3), (ii) u = i-i, 2 - 3i, 1 + 20, (iii) u = (18t2 -8t + 13)/15 

^ / \2/\/6 -1/V6 -l/x/e/ 

'l/3 2/3 2/3^ 
13.80. 1 2/3 -2/3 1/3 

,2/3 1/3 -2/3/ 

/ l/x/3 (l-0/V3\ / * *^ *"*'■' 

13.96. Only (i) and (v) are positive. Moreover, (v) is positive definite. 

1.103. (i) p =. ( 2/^ -1/f y (ii) p = f 2/f -i/^y (iii) p = I ^'^ -^'^ 

-l/\/5 2/V^/ \-l/\/5 2/\/5/ \-l/v^ 3/\/T0 

13.104. (i) x = (3x' - y')/√10, y = (x' + 3y')/√10;   (ii) x = (2x' - y')/√5, y = (x' + 2y')/√5

13.105. x = x'/√3 + y'/√2 + z'/√6,   y = x'/√3 - y'/√2 + z'/√6,   z = x'/√3 - 2z'/√6

13.106. P = [1/√2  -1/√2; 1/√2  1/√2],   P*AP = [2+i  0; 0  2-i]



Appendix A 



Sets and Relations 



SETS, ELEMENTS 

Any well defined list or collection of objects is called a set; the objects comprising the 
set are called its elements or members. We write 

p G A if p is an element in the set A 

If every element of A also belongs to a set B, i.e. if x ∈ A implies x ∈ B, then A is called
a subset of B or is said to be contained in B; this is denoted by

       A ⊂ B   or   B ⊃ A

Two sets are equal if they both contain the same elements; that is,

       A = B   if and only if   A ⊂ B and B ⊂ A

The negations of p ∈ A, A ⊂ B and A = B are written p ∉ A, A ⊄ B and A ≠ B
respectively.

We specify a particular set by either listing its elements or by stating properties which 
characterize the elements in the set. For example, 

A - (1,3,5,7,9} 

means A is the set consisting of the numbers 1, 3, 5, 7 and 9; and 

B — (a; : a; is a prime number, x < 15} 

means that B is the set of prime numbers less than 15. We also use special symbols to 
denote sets which occur very often in the text. Unless otherwise specified: 

N = the set of positive integers: 1, 2, 3, ... ; 

Z = the set of integers: ...,—2,-1,0,1,2,...; 

Q = the set of rational numbers; 

R = the set of real numbers; 

C = the set of complex numbers. 

We also use to denote the empty or null set, i.e. the set which contains no elements; this 
set is assumed to be a subset of every other set. 

Frequently the members of a set are sets themselves. For example, each line in a set 
of lines is a set of points. To help clarify these situations, we use the words cUiss, collection 
and family synonymously with set. The words subclass, subcoUection and subfamily have 
meanings analogous to subset. 

Example A.l : The sets A and B above can also be written as 

A = {a; G N : a; is odd, a; < 10} and B = {2,3,5,7,11,13} 

Observe that 9GA but 9 g B, and 11GB but 11 € A; whereas 3GA and 
3 GB, and 6 € A and 6 g B. 



Example A.2: The sets of numbers are related as follows: N ⊂ Z ⊂ Q ⊂ R ⊂ C.

Example A.3: Let C = {x : x² = 4, x is odd}. Then C = ∅, that is, C is the empty set.

Example A.4: The members of the class {{2, 3}, {2}, {5, 6}} are the sets {2, 3}, {2} and {5, 6}.






The following theorem applies. 

Theorem A.l: Let A, B and C be any sets. Then: (i) Ac A; (ii) if AcB and B(zA, 
then A = B; and (iii) if A c B and BcC, then AcC. 

We emphasize that AcB does not exclude the possibility that A = B. However, if 
AcB but A¥' B, then we say that A is a proper subset of B. (Some authors use the 
symbol c for a subset and the symbol c only for a proper subset.) 

When we speak of an indexed set {at: i G I}, or simply {Oi}, we mean that there is a 
mapping ^ from the set / to a set A and that the image 4>{i) oi i & I is denoted (M. The set 
/ is called the indexing set and the elements a, (the range of <j>) are said to be indexed by /. 
A set (tti, a2, . . . } indexed by the positive integers N is called a sequence. An indexed class 
of sets {Ai : i G I), or simply (Ai), has an analogous meaning except that now the map 4, 
assigns to each i G I a set Ai rather than an element a,. 

SET OPERATIONS 

Let A and B be arbitrary sets. The union of A and B, written AuB, is the set of 
elements belonging to A or to B; and the intersection of A and B, written AnB, is the set 
of elements belonging to both A and B: 

AUB - {x: xG A or xGB} and AnB = {x: x GA and x G B} 

If AnB = 0, that is, if A and B do not have any elements in common, then A and B are 
said to be disjoint. 

We assume that all our sets are subsets of a fixed universal set (denoted here by U). 
Then the complement of A, written A'=, is the set of elements which do not belong to A: 

A<= = {X gU: x^A) 

Example A.5: The following diagrams, called Venn diagrams, illustrate the above set operations.
             Here sets are represented by simple plane areas and U, the universal set, by the
             area in the entire rectangle. (Three Venn diagrams, shading A ∪ B, A ∩ B and Aᶜ
             respectively.)




Sets under the above operations satisfy various laws or identities which are listed in 
the table below. In fact, we state 

Theorem A.2: Sets satisfy the laws in Table 1. 



                          LAWS OF THE ALGEBRA OF SETS

     Idempotent Laws
     1a.  A ∪ A = A                          1b.  A ∩ A = A

     Associative Laws
     2a.  (A ∪ B) ∪ C = A ∪ (B ∪ C)          2b.  (A ∩ B) ∩ C = A ∩ (B ∩ C)

     Commutative Laws
     3a.  A ∪ B = B ∪ A                      3b.  A ∩ B = B ∩ A

     Distributive Laws
     4a.  A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)    4b.  A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)

     Identity Laws
     5a.  A ∪ ∅ = A                          5b.  A ∩ U = A
     6a.  A ∪ U = U                          6b.  A ∩ ∅ = ∅

     Complement Laws
     7a.  A ∪ Aᶜ = U                         7b.  A ∩ Aᶜ = ∅
     8a.  (Aᶜ)ᶜ = A                          8b.  Uᶜ = ∅,  ∅ᶜ = U

     De Morgan's Laws
     9a.  (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ                 9b.  (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ

                                   Table 1

Remark: Each of the above laws follows from an analogous logical law. For example, 

AnB = [x: xGA and x G B} = {x: xGB and x G A) = BnA 

(Here we use the fact that the composite statement "p and q", written p a g, is 
logically equivalent to the composite statement "q and p", i.e. q a p.) 

The relationship between set inclusion and the above set operations follows. 

Theorem A.3: Each of the following conditions is equivalent to A ⊂ B:

             (i) A ∩ B = A    (ii) A ∪ B = B    (iii) Bᶜ ⊂ Aᶜ    (iv) A ∩ Bᶜ = ∅    (v) B ∪ Aᶜ = U

We generalize the above set operations as follows. Let {Aᵢ : i ∈ I} be any family of
sets. Then the union of the Aᵢ, written ∪ᵢ∈I Aᵢ (or simply ∪ᵢ Aᵢ), is the set of elements
each belonging to at least one of the Aᵢ; and the intersection of the Aᵢ, written ∩ᵢ∈I Aᵢ, or
simply ∩ᵢ Aᵢ, is the set of elements each belonging to every Aᵢ.
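
These operations, and De Morgan's law 9a of Table 1, can be illustrated with Python's built-in set type; a sketch (not part of the original text) with a small finite universal set U:

    U = set(range(1, 11))              # universal set {1, ..., 10}
    A = {1, 3, 5, 7, 9}
    B = {2, 3, 5, 7}

    print(A | B)                       # the union A ∪ B
    print(A & B)                       # the intersection A ∩ B
    print(U - A)                       # the complement of A relative to U

    print(U - (A | B) == (U - A) & (U - B))    # True: (A ∪ B)^c = A^c ∩ B^c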

PRODUCT SETS 

Let A and B be two sets. The product set of A and B, denoted hy AxB, consists of all 
ordered pairs (a, 6) where aG A and b G B: 

AxB = {{a, b): aGA, bGB} 
The product of a set with itself, say Ax A, is denoted by A". 






Example A.6: The reader is familiar with the cartesian plane R² = R × R. Here each point P
             represents an ordered pair (a, b) of real numbers, and vice versa. (Figure: the
             coordinate plane with a point P plotted.)



Example A.7: Let A = {1, 2, 3} and B = {a, 6}. Then 

AXB = {(1, a), (1, 6), (2, a), (2, 6), (3, a), (3, b)} 

Remark: The ordered pair (a, 6) is defined rigorously by {a, b) = {{a}, {a, b}}. From this 
definition, the "order" property may be proven; that is, {a, b) = (c, d) if and only 
if a = c and b = d. 

The concept of product set is extended to any finite number of sets in a natural way. 
The product set of the sets Ai, . . . , Am, written Ai x A2 x • • • x Am, is the set consisting of 
all TO-tuples (ci, a2, . . . , fflm) where ai G A. for each i. 
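
The product set of Example A.7 can be listed mechanically; a short sketch (not from the original text) using Python's itertools:

    from itertools import product

    A = [1, 2, 3]
    B = ['a', 'b']

    print(list(product(A, B)))
    # [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')]  -- the six pairs of A x B

    print(len(list(product(A, A, B))))     # |A x A x B| = 3 * 3 * 2 = 18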

RELATIONS 

A binary relation or simply relation R from a set A to a set B assigns to each ordered 
pair {a,b) G Ax B exactly one of the following statements: 

(i) "a is related to b", written aRb, 

(ii) "a is not related to b", written a^b. 

A relation from a set A to the same set A is called a relation in A. 

Example A.8: Set inclusion is a relation in any class of sets. For, given any pair of sets A and B, 
either A cB or A ():B. 

Observe that any relation R from A to B uniquely defines a subset R of Ax B as follows: 

R = {(a, b): aRb} 

Conversely, any subset R ot AxB defines a relation from A to B as follows: 

aRb if and only if (a, b) G R 

In view of the above correspondence between relations from A to B and subsets of A x B, 
we redefine a relation as follows: 

Definition: A relation i? from A to B is a subset of A x 5. 



EQUIVALENCE RELATIONS 

A relation in a set A is called an equivalence relation if it satisfies the following axioms: 
[El] Every a € A is related to itself. 
[E2] If a is related to 6, then b is related to a. 
[Es] If a is related to b and b is related to c, then a is related to c. 

In general, a relation is said to be reflexive if it satisfies [Ei], symmetric if it satisfies [Ez], 
and transitive if it satisfies [£"3]. In other words, a relation is an equivalence relation if 
it is reflexive, symmetric and transitive. 






Example A.9: Consider the relation C of set inclusion. By Theorem A.l, A c A for every set A; 
and a AqB and B<zC, then A c C. That is, C is both reflexive and transitive. 
On the other hand, C is not symmetric, since A c B and A ¥= B implies B cjiA. 

Example A.IO: In Euclidean geometry, similarity of triangles is an equivalence relation. For if 
a, p and V are any triangles, then: (i) a is similar to itself; (ii) if a is similar to yS, 
then p is similar to a; and (iii) if a is similar to p and /8 is similar to y, then a is 
similar to y. 

If R is an equivalence relation in A, then the equivalence class of any element a G A, 
denoted by [a], is the set of elements to which a is related: 

[a] = [x: aRx) 

The collection of equivalence classes, denoted by A/R, is called the quotient of A by R: 

AIR = {[a]: a G A} 

The fundamental property of equivalence relations follows: 

Theorem A.4: Let R be an equivalence relation in A. Then the quotient set A/R is a 
partition of A, i.e. each aGA belongs to a member of A/R, and the mem- 
bers of A/R are pairwise disjoint. 

Example A.11: Let R₅ be the relation in Z, the set of integers, defined by

                      x ≡ y (mod 5)

              which reads "x is congruent to y modulo 5" and which means "x - y is divisible by 5".
              Then R₅ is an equivalence relation in Z. There are exactly five distinct equivalence
              classes in Z/R₅:

                      A₀ = {..., -10, -5, 0, 5, 10, ...}
                      A₁ = {..., -9, -4, 1, 6, 11, ...}
                      A₂ = {..., -8, -3, 2, 7, 12, ...}
                      A₃ = {..., -7, -2, 3, 8, 13, ...}
                      A₄ = {..., -6, -1, 4, 9, 14, ...}

              Now each integer x is uniquely expressible in the form x = 5q + r where 0 ≤ r < 5;
              observe that x ∈ A_r where r is the remainder. Note that the equivalence classes
              are pairwise disjoint and that Z = A₀ ∪ A₁ ∪ A₂ ∪ A₃ ∪ A₄.
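
The five classes of Example A.11 can be generated programmatically over any finite window of integers; a sketch (not part of the original text):

    Z = range(-10, 15)
    classes = {r: [x for x in Z if x % 5 == r] for r in range(5)}

    for r, members in classes.items():
        print(r, members)
    # Each x falls in the class of its remainder r, and the classes partition the window.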



Appendix B 



Algebraic Structures 



INTRODUCTION 

We define here algebraic structures which occur in almost all branches of mathematics. 
In particular we will define a field which appears in the definition of a vector space. We 
begin with the definition of a group, which is a relatively simple algebraic structure with 
only one operation and is used as a building block for many other algebraic systems. 

GROUPS 

Let G be a nonempty set with a binary operation, i.e. to each pair of elements a,b GG 
there is assigned an element ab G G. Then G is called a group if the following axioms hold: 

[Gi] For any a,b,c G G, we have {ab)c = a{bc) (the associative law). 

[G2] There exists an element e GG, called the identity element, such that ae = ea = a for 
every a GG. 

[Ga] For each a GG there exists an element a'^GG, called the inverse of a, such that 
aa~^ = a~^a = e. 

A group G is said to be abelian (or: commutative) if the commutative law holds, i.e. if 
ab = ha for every a,h GG. 

When the binary operation is denoted by juxtaposition as above, the group G is said 
to be written multiplicatively. Sometimes, when G is abelian, the binary operation is de- 
noted by + and G is said to be written additively. In such case the identity element is 
denoted by and is called the zero element; and the inverse is denoted by —a and is called 
the negative of a. 

If A and B are subsets of a group G then we write 

AB = {ab: aGA, bGB}, or A + B = {a + b: a G A, b G B} 
We also write a for {a}. 

A subset H of a group G is called a subgroup of G if H itself forms a group under the 
operation of G. If ^ is a subgroup of G and aGG, then the set Ha is called a right coset 
of H and the set aH is called a left coset of H. 

Definition: A subgroup H of G is called a normal subgroup if a-^HacH for every aGG. 
Equivalently, H is normal if aH = Ha for every aGG, i.e. if the right and 
left cosets of H coincide. 

Note that every subgroup of an abelian group is normal. 

Theorem B.1: Let f? be a normal subgroup of G. Then the cosets of i? in G form a group 
under coset multiplication. This group is called the quotient group and is 
denoted by G/H. 

Example B.1 : The set Z of integers forms an abelian group under addition. (We remark that the 
even integers form a subgroup of Z but the odd integers do not.) Let H denote the 
set of multiples of 5, i.e. H = {. . ., -10, -5, 0, 5, 10, . . .}. Then H is a subgroup 
(necessarily normal) of Z. The cosets of H in Z follow: 


0̄ = 0 + H = H = {..., -10, -5, 0, 5, 10, ...}

1̄ = 1 + H = {..., -9, -4, 1, 6, 11, ...}

2̄ = 2 + H = {..., -8, -3, 2, 7, 12, ...}

3̄ = 3 + H = {..., -7, -2, 3, 8, 13, ...}

4̄ = 4 + H = {..., -6, -1, 4, 9, 14, ...}

For any other integer n ∈ Z, n̄ = n + H coincides with one of the above cosets.
Thus by the above theorem, Z/H = {0̄, 1̄, 2̄, 3̄, 4̄} forms a group under coset addition;
its addition table follows:



              +  | 0̄   1̄   2̄   3̄   4̄
             ----+--------------------
              0̄  | 0̄   1̄   2̄   3̄   4̄
              1̄  | 1̄   2̄   3̄   4̄   0̄
              2̄  | 2̄   3̄   4̄   0̄   1̄
              3̄  | 3̄   4̄   0̄   1̄   2̄
              4̄  | 4̄   0̄   1̄   2̄   3̄



This quotient group Z/H is referred to as the integers modulo 5 and is frequently 
denoted by Z5. Analogously, for any positive integer n, there exists the quotient 
group Z„ called the integers modulo n. 

Example B.2: The permutations of n symbols (see page 171) form a group under composition of
             mappings; it is called the symmetric group of degree n and is denoted by Sₙ. We
             investigate S₃ here; its elements are

                  ε  = (1 2 3; 1 2 3)      σ₁ = (1 2 3; 1 3 2)      σ₂ = (1 2 3; 3 2 1)
                  σ₃ = (1 2 3; 2 1 3)      φ₁ = (1 2 3; 2 3 1)      φ₂ = (1 2 3; 3 1 2)

             Here (1 2 3; i j k) denotes the permutation which maps 1 ↦ i, 2 ↦ j, 3 ↦ k (the
             images are written after the semicolon). The multiplication table of S₃ is

                      |  ε    σ₁   σ₂   σ₃   φ₁   φ₂
                  ----+--------------------------------
                  ε   |  ε    σ₁   σ₂   σ₃   φ₁   φ₂
                  σ₁  |  σ₁   ε    φ₁   φ₂   σ₂   σ₃
                  σ₂  |  σ₂   φ₂   ε    φ₁   σ₃   σ₁
                  σ₃  |  σ₃   φ₁   φ₂   ε    σ₁   σ₂
                  φ₁  |  φ₁   σ₃   σ₁   σ₂   φ₂   ε
                  φ₂  |  φ₂   σ₂   σ₃   σ₁   ε    φ₁

             (The element in the ath row and bth column is ab.) The set H = {ε, σ₁} is a sub-
             group of S₃; its right and left cosets are

                  Right Cosets:   H = {ε, σ₁}     Hφ₁ = {φ₁, σ₂}     Hφ₂ = {φ₂, σ₃}
                  Left Cosets:    H = {ε, σ₁}     φ₁H = {φ₁, σ₃}     φ₂H = {φ₂, σ₂}

             Observe that the right cosets and the left cosets are distinct; hence H is not a normal
             subgroup of S₃.
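
The coset computation can be mirrored in a few lines of Python (a sketch, not from the original text), representing a permutation by the tuple of its images and composing right to left, so that ab means "apply b, then a":

    from itertools import permutations

    def compose(a, b):
        # (ab)(x) = a(b(x)); a permutation is a tuple giving the images of 1, 2, 3
        return tuple(a[b[x] - 1] for x in range(3))

    S3 = list(permutations((1, 2, 3)))
    e, s1 = (1, 2, 3), (1, 3, 2)           # the identity and sigma_1
    H = [e, s1]

    right = {tuple(sorted(compose(h, g) for h in H)) for g in S3}
    left  = {tuple(sorted(compose(g, h) for h in H)) for g in S3}

    print(right == left)    # False: the right and left cosets differ, so H is not normal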

A mapping / from a group G into a group G' is called a homomorphism if /(a6) = 
/(a)/(b) for every a.bGG. (If / is also bijective, i.e. one-to-one and onto, then / is called 
an isomorphism and G and G' are said to be isomorphic.) If f:G-*G' is a homomorphism, 
then the feerraei of / is the set of elements of G which map into the identity element e' e G': 

kernel of / = {aGG: f(a) = e'} 

(As usual, /(G) is called the image of the mapping /: G^G'.) The following theorem 
applies. 

Theorem B.2: Let /: G-» G' be a homomorphism with kernel K. Then X is a normal 
subgroup of G, and the quotient group GIK is isomorphic to the image of /. 




Example B^: Let G be the group of real numbers under addition, and let G' be the group of 
positive real numbers under multiplication. The mapping f : G -* G' defined by 
/(a) — 2" is a homomorphism because 

f(a+b) = 2° + " = 2''2i' = f{a)f(b) 

In particular, / is bijective; hence G and G' are isomorphic. 

Example B.4: Let G be the group of nonzero complex numbers under multiplication, and let G' 
be the group of nonzero real numbers under multiplication. The mapping f : G -* G' 
defined by f(z) — \z\ is a homomorphism because 

/(ziZa) = |ziZ2| = |zi| [zal = /(^i) f(H) 
The kernel K of f consists of those complex numbers z on the unit circle, i.e. for 
which \z\ = 1. Thus G/K is isomorphic to the image of /, i.e. to the group of positive 
real numbers under multiplication. 

RINGS, INTEGRAL DOMAINS AND FIELDS 

Let i? be a nonempty set with two binary operations, an operation of addition (denoted 
by +) and an operation of multiplication (denoted by juxtaposition). Then R is called a 
ring if the following axioms are satisfied: 
[Ri] For any a,b,e G R, we have {a + h) + c = a + (6 + c). 

[Ri] There exists an element G /?, called the zero element, such that a + = + a = a 
for every aGR. 

[Rs\ For each a G J? there exists an element —a G R, called the negative of a, such that 

a + (—a) = (—a) + a = 0. 
[R^ For any a,b G R, we have a + b = b + a. 
[Rs] For any a,b,cG R, we have {ab)c = a{bc). 
[Re] For any a,b,c G R, we have: 

(i) a{b + c) = ab + ac, and (ii) (b + e)a =ba + ca. 

Observe that the axioms [Ri] through [Rt] may be summarized by saying that R is an 
abelian group under addition. 

Subtraction is defined iniJby a — b = a + (— &). 

It can be shown (see Problem B.25) that a- = • a = for every a G R. 

R is called a commutative ring if ab — ba for every a,b G R. We also say that R is 
a ring with a unit element if there exists a nonzero element 1 G R such that o • 1 = 1 • a = o 
for every a G R. 

A nonempty subset S of i? is called a subring of R if S itself forms a ring under the 
operations of R. We note that S is a subring of R if and only if a, & G S implies a-b GS 
and ab G S. 

A nonempty subset / of jB is called a left ideal in R if: (i) a — 6 G / whenever a,b G I, 
and (ii) ra G I whenever r GR, aG I. Note that a left ideal I in R is also a subring of R. 
Similarly we can define a right ideal and a two-sided ideal. Clearly all ideals in com- 
mutative rings are two-sided. The term ideal shall mean two-sided ideal unless otherwise 
specified. 

Theorem B.3: Let / be a (two-sided) ideal in a ring R. Then the cosets {a + I: aGR} 
form a ring under coset addition and coset multiplication. This ring is 
denoted by R/I and is called the quotient ring. 

Now let R he a commutative ring with a unit element. For any aGR, the set 
(a) = {ra: r G R} is an ideal; it is called the principal ideal generated by a. If every ideal 
in iZ is a principal ideal, then R is called a principal ideal ring. 

Definition: A commutative ring R with a unit element is called an integral domain it R 
has no zero divisors, i.e. if ab — implies a = or b = 0. 






Definition: A commutative ring R with a unit element is called a field if every nonzero 
a E R has a multiplicative inverse, i.e. there exists an element a~^ E R such 
that aa~i = a~^a — 1. 

A field is necessarily an integral domain; for if ab = and a t^ 0, then 

b = I'b = a-^ab - a-^'O = 

We remark that a field may also be viewed as a commutative ring in which the nonzero 
elements form a group under multiplication. 

Example B5: The set Z of integers with the usual operations of addition and multiplication is the 
classical example of an integral domain with a unit element. Every ideal / in Z is 
a principal ideal, i.e. / = (m) for some integer n. The quotient ring Z„ = Z/(ji) 
is calle^ the ring of integers modulo n. If n is prime, then Z„ is a field. On the 
other hand, if n is not prime then Z„ has zero divisors. For example, in the ring Zg, 
2 3=0 and 2^0 and 3 # 0. 

Example B.6: The rational numbers Q and the real numbers R each form a field with respect 
to the usual operations of addition and multiplication. 

Example B.7: Let C denote the set of ordered pairs of real numbers with addition and multiplica- 
tion defined by 

(a, 6) + (c, d) = (a + e,b + d) 

(a, 6) '(c, d) = {ae —bd, ad + be) 

Then C satisfies all the required properties of a field. In fact, C is just the field of 
complex numbers (see page 4). 

Example B.8: The set M of all 2 by 2 matrices with real entries forms a noncommutative ring with 
zero divisors under the operations of matrix addition and matrix multiplication. 

Example B.9: Let R be any ring. Then the set jB[a;] of all polynomials over R forms a ring with 
respect to the usual operations of addition and multiplication of polynomials. 
Moreover, if R is an integral domain then R[x] is also an integ^ral domain. 

Now let D be an integral domain. We say that b divides a in D if a = bc for some 
c G D. An element u G D ia called a unit if u divides 1, i.e. if u has a multiplicative inverse. 
An element b GD is called an associate of a G D if b = ua for some unit uG D. A 
nonunit p G D is said to be irreducible if p = ab implies a or 6 is a unit. 

An integral domain D is called a unique factorization domain if every nonunit a G D 
can be written uniquely (up to associates and order) as a product of irreducible elements. 

Example BJO: The ring Z of integers is the classical example of a unique factorization domain. 
The units of Z are 1 and —1. The only associates of n G. Z are n and —n. The 
irreducible elements of Z are the prime numbers. 

Example B.11: The set D = {a + b√13 : a, b integers} is an integral domain. The units of D
              are ±1, 18 ± 5√13 and -18 ± 5√13. The elements 2, 3 - √13 and -3 - √13 are
              irreducible in D. Observe that 4 = 2·2 = (3 - √13)(-3 - √13). Thus D is not
              a unique factorization domain. (See Problem B.40.)
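
A quick arithmetic check of this example (a sketch, not part of the original text), using the norm N(a + b√13) = a² - 13b² introduced in Problem B.40:

    import math

    s = math.sqrt(13)

    def norm(a, b):
        # N(a + b*sqrt(13)) = a^2 - 13*b^2; N is multiplicative
        return a * a - 13 * b * b

    print(norm(18, 5))                    # -1: so 18 + 5*sqrt(13) is a unit
    print(round((3 - s) * (-3 - s), 10))  # 4.0: hence 4 = 2*2 = (3 - sqrt(13))(-3 - sqrt(13))
    print(norm(2, 0), norm(3, -1))        # 4 and -4: neither factor has norm +-1, so neither is a unit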



MODULES 

Let M be a nonempty set and let Rhe a. ring with a unit element. Then M is said to be a 
(left) R-module if M is an additive abelian group and there exists a mapping RxM-* M 
which satisfies the following axioms: 




[Ml] r(mi + mz) — rwi + rm2 

[Mz] (r + s)m = rm + sm 

[Ma] {rs)m = r{sm) 

[M4] I'm — m 

for any r,s GR and any mi G M. 

We emphasize that an JR-module is a generalization of a vector space where we allow 
the scalars to come from a ring rather than a field. 

Example B.12: Let G be any additive abelian group. We make G into a module over the ring Z of 

integers by defining 

n times 



nff = ff + ff+---+ff, Oflr = 0, {-n)ff = -ng 
where n is any positive integer. 

Example B.13: Let iJ be a ring and let / be an ideal in R. Then / may be viewed as a module over R. 

Example B.14: Let V be a vector space over a field K and let T : y -» V be a linear mapping. 
We make V into a module over the ring K[x\ of polynomials over K by defining 
f(x)v = f(T) (v). The reader should check that a scalar multiplication has been 
defined. 

Let iJf be a module over R. An additive subgroup AT of iW is called a submodule of M 
it uGN and kGR imply ku G N. (Note that N is then a module over R.) 

Let M and M' be /2-modules. A mapping T : M-* M' is called a homomorphism (or: 
R-homomorphism or R-linear) if 

(i) T{u + v) = T{u) + T(v) and (ii) T{ku) = kT{u) 

for every u,v G M and every kGR. 



Problems 

GROUPS 

BJ. Determine whether each of the following systems forms a group G: 

(i) G — set of integers, operation subtraction; 

(ii) G = {1, —1}, operation multiplication; 

(iii) G = set of nonzero rational numbers, operation division; 

(iv) G = set of nonsingular nXn matrices, operation matrix multiplication; 

(v) G = {a+bi: a,b e Z}, operation addition. 

B.2. Show that in a group G: 

(i) the identity element of G is unique; 

(ii) each a S G has a unique inverse a~^ G G; 

(iii) (o-i)-i = a, and (ab)-^ = fe-ia-i; 

(iv) 0.6 = ac implies b = c, and 6a = ca implies 6 = c. 

B.3. In a group G, the powers of a G G are defined by 

a" = e, a" = aa"~i, o~" = (a")~i, where nGN 

Show that the following formulas hold for any integers r,8,tS Z: (i) a^'a^ — a^+'^ (ii) (a"")' = a", 
(iii) («>•+«)« = a'-t+st. 




B.4. Show that if G is an abelian group, then (a&)» = a^bn for any a, 6 G G and any Integer « G Z. 

B.5. Suppose G is a group such that {ab)^ = a^b^ for every a, 6 G G. Show that G is abelian. 

B.6. Suppose if is a subset of a group G. Show that H is a subgroup of G if and only if (i) H is non- 
empty, and (ii) a,b G H implies o6~i G H. 

B.7. Prove that the intersection of any number of subgroups of G is also a subgroup of G. 

B.8. Show that the set of all powers of a G G is a subgroup of G; it is called the cyclic group generated 
by a. 

B.9. A group G is said to be cyclic if G is generated by some aG G, i.e. G = {a." : n G Z}. Show that 
every subgroup of a cyclic group is cyclic. 

B.IO. Suppose G is a cyclic subgroup. Show that G is isomorphic to the set Z of integers under addition 
or to the set Z„ (of the integers modulo n) under addition. 

B.ll. Let H be a subgroup of G. Show that the right (left) cosets of H partition G into mutually disjoint 
subsets. 

B.12. The order of a group G, denoted by |G|, is the number of elements of G. Prove Lagrange's theorem: 
If H is a subgroup of a finite group G, then \H\ divides \G\. 

B.13. Suppose \G\ — p where p is prime. Show that G is cyclic. 

B.14, Suppose H and N are subgroups of G with N normal. Show that (i) HN is a subgroup of G and 
(ii) HnN is a normal subgroup of G. 

B.15. Let H be a subgroup of G with only two right (left) cosets. Show that H is a normal subgroup of G. 

B.16. Prove Theorem B.l: Let H he a. normal subgroup of G. Then the cosets of H in G form a group 
G/H under coset multiplication. 

B.17. Suppose G is an abelian group. Show that any factor group G/H is also abelian. 

B.18. Let f : G -* G' be a group homomorphism. Show that: 

(i) /(e) = e' where e and e' are the identity elements of G and G' respectively; 
(ii) /(a-i) = /(a)-i for any a G G. 

B.19. Prove Theorem B.2: Let f : G -* G' be a group homomorphism with kernel K. Then K is a normal 
subgroup of G, and the quotient group G/K is isomorphic to the image of /. 

B.20. Let G be the multiplicative group of complex numbers z such that \z\ = 1, and let B be the additive 
group of real numbers. Prove that G is isomorphic to R/Z. 

B.21. For a fixed fir G G, let g : G ^ G be defined by g(a) = g-^ag. Show that G is an isomorphism of 
G onto G. 

B.22. Let G be the multiplicative group of n X w nonsingular matrices over R. Show that the mapping 
A h* |A| is a homomorphism of G into the multiplicative g:roup of nonzero real numbers. 

B.23. Let G be an abelian group. For a fixed w G Z, show that the map a l-» a" is a homomorphism 
of G into G. 

B.24. Suppose H and N are subgroups of G with N normal. Prove that HnN is normal in H and 
H/{HnN) is isomorphic to HN/N. 

RINGS 

B.25. Show that in a ring R: 

(i) o • = • a. = 0, (ii) o(-6) = (-a)b = -ab, (iii) (-o)(-6) = ab. 

B.26. Show that in a ring R with a unit element: (i) (— l)a = —a, (ii) (— 1)(— 1) = 1. 




B.27. Suppose a^ = a for every a e i?. Prove that i? is a commutative ring. (Such a ring is called a 
Boolean ring.) 

B.28. Let i? be a ring with a unit element. We make R into another ring R by defining a®b = a+b + l 
and^ U'b = ab + a+b. (i) Verify that fi is a ring, (ii) Determine the 0-element and 1-element 
of R. 

B.29. Let G be any (additive) abelian group. Define a multiplication in G by a-b = 0. Show that this 
makes G into a ring. 

B.30. Prove Theorem B.3: Let / be a (two-sided) ideal in a ring R. Then the cosets {a + I:. a G R} form 
a ring under coset addition and coset multiplication. 

B.31. Let /j and I^ be ideals in R. Prove that /j + 1^ and hnlz are also ideals in R. 

B.32. Let R and R' be rings. A mapping f : R ^ R' is called a homomorphism (or: ring homomorphism) if 

(i) f(a + b) =■ f(a) + f{b) and (ii) f(ab) = /(a)/(6), 

for every a,bGR. Prove that if f : R ^ R' is a homomorphism, then the set K = {r G R : f{r) = 0} 
is an ideal in R. (The set K is called the kernel of /.) 

INTEGRAL DOMAINS AND FIELDS 

B.33. Prove that in an integral domain D, if ab = ae, a ¥= 0, then b = c. 

B.34. Prove that F = {a + byji : a, b rational} is a field. 

B.35. Prove that D = {a+ 6\/2 : a, b integers} is an integral domain but not a field. 

B.36. Prove that a finite integral domain D is a field. 

B.37. Show that the only ideals in a field K are {0} and K. 

B.38. A complex number a + bi where a, b are integers is called a Gaussian integer. Show that the set G 
of Gaussian integers is an integral domain. Also show that the units in G are ±1 and ±i. 

B.39. Let D be an integral domain and let / be an ideal in D. Prove that the factor ring D/I is an integral 
domain if and only if / is a prime ideal. (An ideal / is prime if ab G I implies aG I or b&I.) 

B.40. Consider the integral domain D = {a + b\/l3 : a, b integers} (see Example B.ll). If a — a+ by/Ts , 
we define N(a) = a^-lSb^. Prove: (1) NiaP) = N{a)N(fi); (ii) a is a unit if and only if N(a) = ±1; 
(iii) the units of D are ±1, 18 ± 5-\/l3 and -18 ± 5\/l3 ; (iv) the numbers 2, 3 - a/13 and -3 - y/Ts 
are irreducible. 

MODULES 

B.41. Let M be an iJ-module and let A and B be submodules of M. Show that A+B and AnB are also 
submodules of M. 

B.42. Let M be an iZ-module with submodule N. Show that the cosets {u + N : u G M} form an iJ-module 
under coset addition and scalar multiplication defined by r{u + N) = ru + N. (This module is de- 
noted by M/N and is called the quotient module.) 

B.43. Let M and M' be B-modules and let f : M -* M' be an iZ-homomorphism. Show that the set 
K = {uGM: f(u) = 0} is a submodule of /. (The set K is called the kernel of /.) 

B.44. Let M be an i?-module and let E(M) denote the set of all fi-homomorphism of M into itself. Define 
the appropriate operations of addition and multiplication in E{M) so that E(M) becomes a ring. 



Appendix C 



Polynomials over a Field 



INTRODUCTION 

We will investigate polynomials over a field K and show that they have many properties 
which are analogous to properties of the integers. These results play an important role 
in obtaining canonical forms for a linear operator T on a vector space V over K. 



RING OF POLYNOMIALS 

Let X be a field. Formally, a polynomial / over K is an infinite sequence of elements 
from K in which all except a finite number of them are 0: 

/ = ( . . . , 0, On, . . . , ai, tto) 

(We write the sequence so that it extends to the left instead of to the right.) The entry ak 

is called the kth coefficient of /. If n is the largest integer for which a„ ¥- 0, then we say 

that the degree of / is n, written 

deg/ = n 

We also call a„ the leading coefficient of /, and if a„ = 1 we call / a monic polynomial. On 
the other hand, if every coefficient of / is then / is called the zero polynomial, written 
/ = 0. The degree of the zero polynomial is not defined. 

Now if g is another polynomial over K, say 

g - {.. .,0,bm, . . .,bi,bo) 

then the sum f + g is the polynomial obtained by adding corresponding coefficients. That 
is, if m — n then 

f + g = ( . . . , 0, a„, . . . , a™ + 6m, . . . , ai + 6i, ao + bo) 

Furthermore, the product fg is the polynomial 

fg = ( . . . , 0, anbm, . . . , aibo + aobi, Oobo) 
that is, the kth coefl5cient Ck of fg is 

Cfc = 2^ ttibk-i = aobk + aibk-i + • • • + akbo 

i=0 

The following theorem applies. 

Theorem C.1: The set P of polynomials over a field K under the above operations of addi-
             tion and multiplication forms a commutative ring with a unit element
             and with no zero divisors, i.e. an integral domain. If f and g are nonzero
             polynomials in P, then deg (fg) = deg f + deg g.
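
A quick check of the degree formula with a computer algebra system (a sketch, not from the original text; sympy is assumed to be available):

    from sympy import symbols, degree

    t = symbols('t')
    f = 2*t**3 + t - 5
    g = t**2 + 4

    print(degree(f*g, t))                  # 5
    print(degree(f, t) + degree(g, t))     # 5 = deg f + deg g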


NOTATION 

We identify the scalar ao G X with the polynomial 

ao = ( . . . , 0, tto) 

We also choose a symbol, say t, to denote the polynomial 

t = (...,0,1,0) 

We call the symbol t an indeterminant. Multiplying t with itself, we obtain 

t' = {.. ., 0, 1, 0, 0), t' = (. . ., 0, 1, 0, 0, 0), ... 

Thus the above polynomial / can be written uniquely in the usual form 

/ = Unt" + ■ • • + ait + ao 

When the symbol t is selected as the indeterminant, the ring of polynomials over K is 
denoted by 

and a polynomial / is frequently denoted by f(t). 

We also view the field X as a subset of K[t] under the above identification. This is pos- 
sible since the operations of addition and multiplication of elements of K are preserved 
under this identification: 

(...,0, ao) + (..., 0,6o) = {...,0, ao + bo) 

(..., 0,ao) •(..., 0, 6o) - (...,0, aobo) 

We remark that the nonzero elements of K are the units of the ring K[t]. 

We also remark that every nonzero polynomial is an associate of a unique monic poly- 
nomial. Hence if d and d' are monic polynomials for which d divides d' and d' divides d, 
then d = d'. (A polynomial g divides a polynomial / if there is a polynomial k such that 
/ = hg.) 

DIVISIBILITY 

The following theorem formalizes the process known as "long division". 

Theorem C.2 (Division Algorithm) : Let / and g be polynomials over a field K with g ¥=0. 

Then there exist polynomials q and r such that 

/ = qg + r 

where either r = or deg r < deg g. 

Proof: If f = 0 or if deg f < deg g, then we have the required representation

              f = 0·g + f

Now suppose deg f ≥ deg g, say

              f = aₙtⁿ + ··· + a₁t + a₀   and   g = bₘtᵐ + ··· + b₁t + b₀

where aₙ, bₘ ≠ 0 and n ≥ m. We form the polynomial

              f₁ = f - (aₙ/bₘ) tⁿ⁻ᵐ g     (1)

Then deg f₁ < deg f. By induction, there exist polynomials q₁ and r such that

              f₁ = q₁g + r




where either r = 0 or deg r < deg g. Substituting this into (1) and solving for f,

              f = (q₁ + (aₙ/bₘ)tⁿ⁻ᵐ) g + r

which is the desired representation.
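
The division algorithm can be carried out mechanically; a sketch (not part of the original text) using sympy:

    from sympy import symbols, div

    t = symbols('t')
    f = t**3 + 2*t**2 - t + 3
    g = t**2 + 1

    q, r = div(f, g, t)                 # f = qg + r with deg r < deg g
    print(q, r)                         # t + 2   and   -2*t + 1
    print((q*g + r).expand() - f)       # 0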

Theorem C.3: The ring K[t] of polynomials over a field X is a principal ideal ring. If / is 
an ideal in K[t], then there exists a unique monic polynomial d which gen- 
erates /, that is, such that d divides every polynomial / G 7. 

Proof. Let d be a polynomial of lowest degree in 7. Since we can multiply d by a non- 
zero scalar and still remain in 7, we can assume without loss in generality that d is a monic 
polynomial. Now suppose / G 7. By Theorem C.2 there exist polynomials q and r such that 

/ = qd + r where either r = or deg r < deg d 

Now f,d G I implies qd G I and hence r = f — qd E I. But d is a polynomial of lowest 
degree in 7. Accordingly, r = and / = qd, that is, d divides /. It remains to show that 
d is unique. If d' is another monic polynomial which generates I, then d divides d' and d' 
divides d. This implies that d = d', because d and d' are monic. Thus the theorem is 
proved. 

Theorem C.4: Let / and g be nonzero polynomials in K[t]. Then there exists a unique 
monic polynomial d such that: (i) d divides / and g; and (ii) if d' divides 
/ and g, then d' divides d. 

Definition: The above polynomial d is called the greatest common divisor of / and g. If 
d — 1, then / and g are said to be relatively prime. 

Proof of Theorem CA. The set 7 = {mf + ng : m,nG K[t]} is an ideal. Let d be the 
monic polynomial which generates 7. Note f,g G I; hence d divides / and g. Now suppose 
d' divides / and g. Let / be the ideal generated by d'. Then f,g GJ and hence Icj. 
Accordingly, d Gj and so d' divides d as claimed. It remains to show that d is unique. 
If di is another (monic) greatest common divisor of / and g, then d divides di and di divides 
d. This implies that d — di because d and di are monic. Thus the theorem is proved. 

Corollary C.5: Let d be the greatest common divisor of the polynomials / and g. Then 
there exist polynomials m and n such that d = mf + ng. In particular, if 
/ and g are relatively prime then there exist polynomials m and n such 
that mf + ng = 1. 

The corollary follows directly from the fact that d generates the ideal 

7 = {mf + ng:m,nGK[t]} 
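
The polynomials m and n of Corollary C.5 can be produced by the extended Euclidean algorithm; an illustrative sketch (not from the original text) with sympy:

    from sympy import symbols, gcdex, simplify

    t = symbols('t')
    f = t**3 - 1
    g = t**2 - 1

    m, n, d = gcdex(f, g, t)            # mf + ng = d, the monic greatest common divisor
    print(d)                            # t - 1
    print(simplify(m*f + n*g - d))      # 0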

FACTORIZATION 

A polynomial p G K[t] of positive degree is said to be irreducible if p — fg implies 
/ or gr is a scalar. 

Lemma C.6: Suppose p G K[t] is irreducible. If p divides the product fg of polynomials 
f,g G K[t], then p divides f or p divides g. More generally, if p divides the 
product of n polynomials /1/2. . .fn, then p divides one of them. 

Proof: Suppose p divides fg but not f. Since p is irreducible, the polynomials f and
p must then be relatively prime. Thus there exist polynomials m, n ∈ K[t] such that
mf + np = 1. Multiplying this equation by g, we obtain mfg + npg = g. But p divides fg
and so mfg, and p divides npg; hence p divides the sum g = mfg + npg.







Now suppose p divides f_1 f_2 ··· f_n. If p divides f_1, then we are through. If not, then by
the above result p divides the product f_2 ··· f_n. By induction on n, p divides one of the poly-
nomials f_2, ..., f_n. Thus the lemma is proved.

Theorem C.7 (Unique Factorization Theorem): Let f be a nonzero polynomial in K[t].
Then f can be written uniquely (except for order) as a product

     f = k p_1 p_2 ··· p_n

where k ∈ K and the p_i are monic irreducible polynomials in K[t].
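
For example, over the rational field Q,

     2t^3 − 2t = 2 · t (t − 1)(t + 1)

with k = 2 and monic irreducible factors t, t − 1 and t + 1; by the theorem this is the only such factorization apart from the order of the factors.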

Proof: We prove the existence of such a product first. If f is irreducible or if f ∈ K,
then such a product clearly exists. On the other hand, suppose f = gh where g and h are
nonscalars. Then g and h have degrees less than that of f. By induction, we can assume

     g = k_1 g_1 g_2 ··· g_r   and   h = k_2 h_1 h_2 ··· h_s

where k_1, k_2 ∈ K and the g_i and h_j are monic irreducible polynomials. Accordingly,

     f = (k_1 k_2) g_1 g_2 ··· g_r h_1 h_2 ··· h_s

is our desired representation.

We next prove uniqueness (except for order) of such a product for f. Suppose

     f = k p_1 p_2 ··· p_n = k' q_1 q_2 ··· q_m

where k, k' ∈ K and the p_1, ..., p_n, q_1, ..., q_m are monic irreducible polynomials. Now p_1
divides k' q_1 ··· q_m. Since p_1 is irreducible it must divide one of the q_i by the above lemma.
Say p_1 divides q_1. Since p_1 and q_1 are both irreducible and monic, p_1 = q_1. Accordingly,
cancelling p_1 = q_1,

     k p_2 ··· p_n = k' q_2 ··· q_m

By induction, we have that n = m and p_2 = q_2, ..., p_n = q_m for some rearrangement of
the q_i. We also have that k = k'. Thus the theorem is proved.

If the field K is the complex field C, then we have the following result which is known 
as the fundamental theorem of algebra; its proof lies beyond the scope of this text. 

Theorem C.8 (Fundamental Theorem of Algebra): Let f(t) be a nonzero polynomial
over the complex field C. Then f(t) can be written uniquely (except for
order) as a product

     f(t) = k (t − r_1)(t − r_2) ··· (t − r_n)

where k, r_i ∈ C, i.e. as a product of linear polynomials.

In the case of the real field R we have the following result.

Theorem C.9: Let f(t) be a nonzero polynomial over the real field R. Then f(t) can be
written uniquely (except for order) as a product

     f(t) = k p_1(t) p_2(t) ··· p_m(t)

where k ∈ R and the p_i(t) are monic irreducible polynomials of degree one
or two.
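
For example, t^4 − 1 = (t − 1)(t + 1)(t^2 + 1) over R, where the quadratic factor t^2 + 1 is irreducible over R; over C the same polynomial splits into linear factors, t^4 − 1 = (t − 1)(t + 1)(t − i)(t + i), as Theorem C.8 guarantees.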



INDEX 



Abelian group, 320 
Absolute value, 4 
Addition, 

in R^n, 2

of linear mappings, 128 

of matrices, 36 
Adjoint, 

classical, 176 

operator, 284 
Algebra, 

isomorphism, 169 

of linear operators, 129 

of square matrices, 43 
Algebraic multiplicity, 203 
Alternating, 

bilinear forms, 262 

multilinear forms, 178, 277 
Angle between vectors, 282 
Annihilator, 227, 251 
Anti-symmetric 

bilinear form, 263 

operator, 285 
Augmented matrix, 40 

Basis, 88 

change of, 153 
Bessel's inequality, 309 
Bijective mapping, 123 
Bilinear form, 261, 277 
Binary relation, 318 
Block matrix, 45 
Bounded function, 65 

C,4 

C^n, 5

Cayley-Hamilton theorem, 201, 211 

Canonical forms in 

Euclidean spaces, 288 

unitary spaces, 290 

vector spaces, 222 
Cauchy-Schwarz inequality, 4, 10, 281 
Cells, 45 

Change of basis, 153 
Characteristic, 

equation, 200 

matrix, 200 

polynomial, 200, 203, 210 

value, 198 

vector, 198 
Classical adjoint, 176 
Co-domain, 121 
Coefficient matrix, 40 
Cofactor, 174 



Column, 

of a matrix, 35 

rank, 90 

space, 67 

vector, 36 
Companion matrix, 228 
Complex numbers, 4 
Components, 2 

Composition of mappings, 121 
Congruent matrices, 262 
Conjugate complex number, 4 
Consistent linear equations, 31 
Convex, 260 
Coordinate, 2 

vector, 92 
Coset, 229 
Cramer's rule, 177 
Cyclic group, 325 
Cyclic subspaces, 227 

Decomposition, 

direct sum, 224 

primary, 225 
Degenerate bilinear form, 262 
Dependent vectors, 86 
Determinant, 171 
Determinantal rank, 195 
Diagonal 

matrix, 43 

of a matrix, 43 
Diagonalization, 

Euclidean spaces, 288 

unitary spaces, 290 

vector spaces, 155, 199 
Dimension, 88 
Direct sum, 69, 82, 224 
Disjoint, 316 
Distance, 3, 280 
Distinguished elements, 41 
Division algorithm, 328 
Domain, 

integral, 322 

of a mapping, 121 
Dot product, 

in C^n, 6

in R^n, 3
Dual 

basis, 250 

space, 249 

Echelon form, 
linear equations, 21 
matrices, 41 












Echelon matrix, 41 
Eigenspace, 198, 205 
Eigenvalue, 198 
Eigenvector, 198 
Element, 315 
Elementary, 

column operation, 61 

divisors, 229 

matrix, 56 

row operation, 41 
Elimination, 20 
Empty set, 315 
Equality 

of matrices, 36 

of vectors, 2 
Equations (see Linear equations) 
Equivalence relation, 318 
Equivalent matrices, 61 
Euclidean space, 3, 279 
Even 

function, 83 

permutation, 171 
External direct sum, 82 

Field, 323 
Free variable, 21 
Function, 121 
Functional, 249 

Gaussian integers, 326 

Generate, 66 

Geometric multiplicity, 203 

Gram-Schmidt orthogonalization, 283 

Greatest common divisor, 329 

Group, 320 

Hermitian, 

form, 266 

matrix, 266 
Hilbert space, 280 
Hom (V, U), 128 

Homogeneous linear equations, 19 
Homomorphism, 123 
Hyperplane, 14 

Ideal, 322 
Identity, 

element, 320 

mapping, 123 

matrix, 43 

permutation, 172 
Image, 121, 125 
Inclusion mapping, 146 
Independent 

subspaces, 244 

vectors, 86 
Index 

of nilpotency, 225 

set, 316 
Injective mapping, 123
Inner product, 279 
Inner product space, 279 
Integers modulo n, 323



Integral domain, 322 
Intersection of sets, 316 
Invariant subspace, 223 
Inverse, 

mapping, 123 

matrix, 44, 176 
Invertible, 

linear operator, 130 

matrix, 44 
Irreducible, 323, 329 
Isomorphism of 

algebras, 169 

groups, 321 

inner product spaces, 286, 311 

vector spaces, 93, 124 

Jordan canonical form, 226 

Kernel, 123, 321, 326 

l2-space, 280

Line segment, 14, 260 

Linear combination 

of equations, 30 

of vectors, 66 
Linear dependence, 86 

in R^n, 28
Linear equations, 18, 127, 176, 251, 282 
Linear functional, 249 
Linear independence, 86 

in R^n, 28
Linear mapping, 123 

matrix of, 160 

rank of, 126 
Linear operators, 129 
Linear span, 66 

Mapping, 121 

linear, 123 
Matrices, 35 

addition, 36 

augmented, 40 

block, 45 

change of basis, 153 

coefficient, 40 

column, 35 

congruent, 262 

determinant, 171 

diagonal, 43 

echelon, 41 

equivalent, 61 

Hermitian, 266 

identity, 43 

multiplication, 39 

normal, 290 

rank, 90 

row, 35 

row canonical form, 42, 68 

row equivalent, 41 

row space, 60 

scalar, 43 

scalar multiplication, 36 

similar, 155 

size, 35 









Matrices (cont.) 

square, 43 

symmetric, 65, 288 

transition, 153 

transpose, 39 

triangular, 43 

zero, 37 
Matrix representation, 

bilinear forms, 262 

linear mappings, 150 
Maximal independent set, 89 
Minimal polynomial, 202, 212 
Minkowski's inequality, 10 
Minor, 174 
Module, 323 
Monic polynomial, 201 
Multilinear, 178, 277 
Multiplication of matrices, 37, 39 

N (positive integers), 315 

n-space, 2 

n-tuple, 2 

Nilpotent, 225 

Nonnegative semi-definite, 266 

Nonsingular, 

linear mapping, 127 

matrix, 130 
Norm, 279 

in R^n, 4
Normal operator, 286, 290, 303 
Normal subgroup, 320 
Normalized vector, 280 
Null set, 315 
Nullity, 126 

Odd, 

function, 73 

permutation, 171 
One-to-one mappings, 123 
Onto mappings, 123 
Operations with linear mappings, 128 
Operators (see Linear operators) 
Ordered pair, 318 
Orthogonal 

complement, 281 

matrix, 287 

operator, 286 

vectors, 3, 280 
Orthogonally equivalent, 288 
Orthonormal, 282 

Parallelogram law, 307 
Parity, 171 
Partition, 319 
Permutations, 171 
Polar form, 264, 307 
Polynomials, 327 
Positive 

matrix, 310 

operator, 288 
Positive definite, 

bilinear form, 265 

matrix, 272, 310 

operator, 288 



Primary decomposition theorem, 225 

Prime ideal, 326 

Principal ideal, 322 

Principal minor, 219 

Product set, 317 

Projection operator, 243, 308 

orthogonal, 281 
Proper 

subset, 316 

value, 198 

vector, 198 

Q (rational numbers), 315 
Quadratic form, 264 
Quotient, 

group, 320 

module, 326 

ring, 322 

set, 319 

space, 229 

R (real field), 315

R^n, 2

Rank, 

bilinear form, 262 

linear mapping, 126 

matrix, 90, 195 
Rational canonical form, 228 
Relation, 318 
Relatively prime, 329 
Ring, 322 
Row, 

canonical form, 42 

equivalent matrices, 41 

of a matrix, 35 

operations, 41 

rank, 90 

reduced echelon form, 41 

reduction, 42 

vector, 36 

Scalar, 2, 63 

mapping, 219 

matrix, 43 
Scalar multiplication, 69 

of linear mappings, 128 

of matrices, 36 
Second dual space, 251 
Self -adjoint operator, 285 
Set, 315 
Sgn, 171 

Sign of a permutation, 171 
Signature, 265, 266 
Similar matrices, 155 
Singular mappings, 127 
Size of a matrix, 35
Skew-adjoint operator, 285 
Skew-symmetric bilinear form, 263 
Solution, 

of linear equations, 18, 23 

space, 65 
Span, 66 

Spectral theorem, 291 
Square matrices, 43 












Subgroup, 320 

Subring, 322 

Subset, 315 

Subspace (of a vector space), 65 

sum of, 68 
Surjective mapping, 123 
Sylvester's theorem, 265 
Symmetric, 

bilinear form, 263 

matrix, 65 

operator, 285, 288, 300 
System of linear equations, 19 

Trace, 155 

Transition matrix, 153 
Transpose, 

of a linear mapping, 252 

of a matrix, 39 
Transposition, 172 
Triangle inequality, 293 
Triangular, 

form, 222 

matrix, 43 
Trivial solution, 19 

Union of sets, 316 



Unique factorization, 323 
Unit vector, 280 
Unitarily equivalent, 288 
Unitary, 

matrix, 287 

operator, 286 

space, 279 
Universal set, 316 
Upper triangular matrix, 43 
Usual basis, 88, 89 

Vector, 63 

in C^n, 5

in R^n, 2
Vector space, 63 
Venn diagram, 316 

Z (integers), 315 

Z_n (ring of integers modulo n), 323

Zero, 

mapping, 124 

matrix, 37 

of a polynomial, 44 

solution, 19 

vector, 3, 63 



